used, but states that there remains no description of specific embodiments of the claimed 
invention other than ones featuring E. coli. The Office Action states that "no description is 
provided, for example, for tRNA genes corresponding to rarely used codons in plant cells or 
protozoa, two large classes of cell types embraced by the claims (i.e., rarely used codons)." The 
Office Action also states that the prior art does not appear to provide teachings as to which of the 
many known tRNA genes correspond to "rarely used" codons for the many different cell types 
encompassed by the claims, and that there is no evidence of record that rare codon patterns have 
been established for a sufficient number of cell types for one of skill in the art to be able to 
envision a sufficient number of specific embodiments of the invention to describe the very 
broadly claimed genus. Finally, with respect to Written Description, the Office Action states 
that there remains no evidence of record to indicate that a sufficient number of tRNA genes 
obtained from different cell types corresponding to rarely used codons of different cell types 
were known in the prior art for one of skill in the art to envision a sufficient number of 
embodiments of the claimed vectors and host cells to describe the broad genus of host cells and 
vectors encompassed by the claims. Applicants respectfully disagree. 

Under the law, in order to satisfy the written description requirement, a patent 
specification must describe the claimed invention in sufficient detail that one skilled in the art 
can reasonably conclude that the inventor had possession of the claimed invention. Vas-Cath, 
Inc. v. Mahurkar, 935 F.2d 1555, 1563; M.P.E.P. §2163. Further, under both the written 
description and enablement requirements, one need not describe in detail that which is well 
known in the art. M.P.E.P. §2163, citing Hybritech v. Monoclonal Antibodies, 802 F.2d 1367, 
1384. The Office Action states that "the critical elements of applicant's invention are the tRNA 
genes corresponding to rarely used codons on the claimed vectors which are determined by the 
combination of host cell type (i.e., rarely used codons), the corresponding tRNA genes and the 
protein to be expressed." Applicants submit that patterns of codon usage were well known to 
those skilled in the art at the time of filing. For example, Nakamura et al., 1996, Nucleic Acids 
Res. 24: 214-215 (Exhibit A) provides codon usage tabulated from the GenBank international 
DNA sequence databases for 4,805 species. Applicants submit that these species include 
prokaryotes, protozoa and fungi, and a wide variety of higher eukaryotes, including animals and 
plants. The codon usage tabulation of Nakamura necessarily details low, as well as high codon 
usage in each of the thousands of species examined. 
Further, Zhang et al., 1991, Gene 105: 61-72 (Exhibit B) details low usage codons in 
species as diverse as E. coli (prokaryotic), yeast (S. cerevisiae; protozoan), Drosophila and 
eleven species of primates. The paper describes the unique combination of least-used codons for 
the species examined. 

In addition, Saier (1995, FEBS Lett. 362: 1-4; Exhibit C) describes rare codon usage in 
species including Rhodobacter capsulatus, R. spheroides, Clostridium acetobutylicum, 
Streptomyces coelicolor and E. coli. Saier relates the rare codon usage to the regulation of 
metabolically sensitive genes. 

In view of these sources of information, particularly in view of the comprehensive nature 
of the Nakamura database before the filing of the subject application, Applicants submit that a 
sufficient number of rare codon usage patterns have been established in the prior art for a 
sufficient number of cell types for one of skill in the art to readily envision a sufficient number of 
specific embodiments of rare codon usage patterns to describe the claimed genus. 

With regard to tRNA genes, Applicants submit that a wide variety of tRNA genes was 
also known in the art at the time of filing. For example, as early as 1984, Sprinzl & Gauss 
described a compilation of 353 sequences of tRNA genes including cellular and mitochondrial 
tRNAs from bacteria and phage, plants, yeasts and fungi, insects, amphibians and mammals, 
including rats, mice, cows and humans (Nucleic Acids Res. 12 Suppl.: r59-131; Exhibit D). 
Further, there was available on the World Wide Web as of the end of 1998 (before filing), a 
compilation of tRNA sequences and genes including 3279 sequences. This number is taken from 
documentation on the current WWW compilation at 
www.uni-bayreuth.de/departments/biochemie/trna (see Exhibit E). The thousands of tRNA genes 
described include those for rarely used tRNAs. Additional rare tRNAs are described in, for 
example: Kawakami et al., 1993, Genetics 135: 309-320 (Exhibit F), which describes a rare 
Arg-tRNA-CCU in Saccharomyces cerevisiae; and Clouthier et al., 1998, J. Bacteriol. 180: 
840-845 (Exhibit G), which describes the rare Arg-tRNA-AGA in Salmonella enteritidis. Applicants 
therefore submit that there was known in the art, at the time of filing, a large number of tRNA 
genes, including those encoding rare tRNAs, from a broad cross section of species. 
The Office Action states that the prior art does not appear to provide teachings as to 
which of the many known tRNA genes correspond to "rarely used" codons for the many 
different cell types encompassed by the claimed invention. Applicants submit that given the 
extensive data on codon usage available in the art (e.g., Exhibits A-C), one of skill in the art 
would know if a given tRNA gene, e.g., one described in any of Exhibits D, F or G, corresponds 
to a rarely used codon. 

In view of the above, and given the description provided in the specification, Applicants 
submit that the invention of claims 1-16 and 18-44 is described in sufficient detail to enable one 
of skill in the art to envision a sufficient number of embodiments of the claimed vectors and host 
cells to describe the full scope of the claimed genus of vectors and host cells. Applicants 
respectfully request the withdrawal of the §112, first paragraph written description rejection of 
claims 1-16 and 18-44. 

Rejections under 35 U.S.C. §103: 

All claims remain rejected under 35 U.S.C. §103 as obvious over Del Tito et al. in 
combination with one or more of Makoff et al., the 1997 Novagen catalog, and Wnendt. The 
Office Action rejects the evidence of commercial success as an objective indicator of non- 
obviousness because "there remains no meaningful background against which the sales figures 
presented in Paper No. 10 can be weighed to determine if the demonstrated sales are so 
indicative of commercial success as to make the claimed invention unobvious." The Office 
Action further states that there needs to be a showing that the commercial success is 
commensurate with the claimed invention, and that "A demonstration of commercial success for 
a couple of specific embodiments useful in E. coli cannot be considered as evidence of 
nonobviousness commensurate with the full, broadly claimed genus of host cells and vector of 
the instant invention." Applicants respectfully disagree. 

First, Applicants submit that the law does not absolutely require evidence of market share 
in order for commercial sales of a product of an invention to be persuasive of nonobviousness. 
The Federal Circuit has held that in order to demonstrate non-obviousness, the commercial 
success of the product must be due to the merits of the claimed invention beyond what was 
readily available in the prior art. Richdel, Inc., v. Sunspool Corp., 714 F.2d 1573 (Fed. Cir. 
1983). With regard to whether the commercial success is due to the merits of the claimed 
invention, in J.T. Eaton & Co., Inc. v. Atlantic Paste & Glue Co. the Federal Circuit held that a 
primary showing of commercial success limited to sales, coupled with a demonstration that the 
commercial success of the product derives from the claimed invention and is attributable to 
something disclosed in the patent that was not readily available in the prior art is entitled to the 
presumption that the commercial success of the product is attributable to the patented 
invention. J.T. Eaton & Co., Inc. v. Atlantic Paste & Glue Co., 106 F.3d 1563, 1571 (Fed. 
Cir. 1997). 

Applicants submit, and the Buchanan Declarations support the conclusion, that the 
products sold are embodiments of the claimed invention, falling plainly within the scope of claim 
1. This claim requires a host cell containing a recombinant DNA molecule which comprises an 
array of three or more tRNA genes, wherein the tRNA genes correspond to codons that are rarely 
used in the host cell. As stated in Ms. Buchanan's first Rule 132 Declaration, each of the 
competent cell products for which sales figures are reported contains three tRNA genes 
corresponding to rarely used codons. Applicants submit that the prior art does not teach cells 
with three or more tRNA genes corresponding to codons that are rarely used in the host cell. 
Further, the first Buchanan Declaration states that the commercial success of the claimed 
invention is not the result of heavy promotion or advertising, noting that Stratagene spent no 
more on promotion or advertising of this product than it did on any other competent cell product 
it sells. The product also sells for a considerably higher price than non-codon-enhanced cells 
sold by the same company. Thus, the commercial success of the product is attributable not to 
heavy promotion or lower price, but to something disclosed in the patent that was not readily 
available in the prior art. 

Under J.T. Eaton, having shown the necessary correspondence between the commercial 
success and the claimed invention, Applicants are thus entitled to the presumption that the 
commercial success of the product is attributable to the claimed invention. Under these 
circumstances, also in accord with J.T. Eaton, Applicants submit that sales alone, in the absence 
of market data, are indicative of non-obviousness. In view of this, and, where, as in the instant 
case, there are no data regarding market share because there were no competing products at the 
time of the sales reported (itself a strong indicator of non-obviousness), Applicants submit that 
the sales figures provided are evidence of non-obviousness. 

With regard to the assertion that "A demonstration of commercial success for a couple of 
specific embodiments useful in E. coli cannot be considered as evidence of nonobviousness 
commensurate with the full, broadly claimed genus," Applicants submit that the prior art cited is 
drawn to expression in E. coli, as is the evidence of commercial success of the claimed 
invention. Thus, the scope of the demonstrated commercial success of the claimed invention is 
directly relevant to the non-obviousness of the claimed invention over the cited prior art. Thus, 
giving proper weight to the commercial success of the embodiments sold (as discussed above), 
E. coli embodiments of the claimed invention are non-obvious. Further, if E. coli embodiments, 
for which the prior art appears to be most relevant, are not obvious, Applicants submit that there 
is no reason to conclude that embodiments encompassing other cell types, for which there is a 
lack of relevant prior art, would be obvious. Applicants therefore submit that the scope 
of the invention encompassed by the commercial embodiments is sufficient to overcome the 
alleged obviousness of the claimed invention. 

In view of the above, Applicants submit that the invention of claims 1-16 and 18-44 is 
not obvious over the combination of references cited. Applicants respectfully request that the 
rejection of these claims under §103 be withdrawn. 

Applicants submit that in view of the preceding remarks and the Exhibits provided, all 
issues raised in the Office Action have been addressed herein. Applicants respectfully request 
reconsideration of the claims. 

Respectfully submitted, 

Dated: September 26, 2002 

Kathleen M. Williams, Ph.D. 
Registration No. 34,380 
Attorney for Applicant 
PALMER & DODGE LLP 
111 Huntington Avenue 
Telephone: (617) 239-0451 
Telecopier: (617) 227-4420 



EXHIBIT A 

214-215 Nucleic Acids Research, 1996, Vol. 24, No. 1 © 1996 Oxford University Press 

Codon usage tabulated from the international DNA 
sequence databases 

Yasukazu Nakamura, Ken-nosuke Wada¹, Yoshiko Wada, Hirofumi Doi², 
Shigehiko Kanaya³, Takashi Gojobori and Toshimichi Ikemura* 

National Institute of Genetics and The Graduate University for Advanced Studies, Mishima, Shizuoka 411, Japan, 
¹ATR Limited, Seika, Kyoto 619-02, Japan, ²Fujitsu Laboratories Ltd, Chiba 261, Japan and ³Yamagata 
University, Yonezawa, Yamagata 992, Japan 

Received August 31, 1995; Revised and Accepted October 4, 1995 



ABSTRACT 

Codon usage in 87 602 genes has been calculated 
using the nucleotide sequence data obtained from the 
GenBank Genetic Sequence Data Bank (Release 90.0; 
September 1995). The database is called the CUTG 
Database; the complete form of the database can be 
obtained by anonymous ftp from DDBJ and a part of 
the database, which lists the frequency of codon use 
in each organism, is made searchable through our 
World Wide Web server. 

SOURCE AND METHODS 

Codon usage in individual genes has been calculated using the 
nucleotide sequence data obtained from the GenBank Genetic 
Sequence Data Bank (Release 90.0; September 1995). The 
compilation of codon usage is synchronized with each major 
release of GenBank. The resulting database is called the CUTG 
database (1-5). 

In selecting protein coding sequences we relied on the 
FEATURES tables of GenBank, and only complete genes 
without ambiguous bases were used in the analysis. In 
GenBank, a group of consecutive genes whose entire region had 
been sequenced were registered under one LOCUS name. To 
distinguish the different genes belonging to a single LOCUS, the 
symbol # followed by a number is added after the LOCUS name; 
the numbers represent the order of the CDS registered in the 
FEATURES table of GenBank. When introns of a gene have not 
been completely sequenced, some of its exons are registered in 
separate entries (LOCUS) in GenBank. These exons, belonging 
to the same gene but having different LOCUS names, were 
combined into one entry and the first LOCUS name is added. 

For the biological significance of codon usage, see Ikemura (6) 
and Aota and Ikemura (7,8). 

FILES 

Files of the present database, containing codon usage of 87 602 
CDSs of 4805 species, are available by anonymous ftp from 



DDBJ. Files named as gb***.codon list the codon use in each 
gene registered in the GenBank Sequence files (gb***.seq). The 
LOCUS names given in GenBank were used to designate 
individual genes. Each LOCUS name is followed by fields of 
information extracted from the FEATURES of each CDS for 
defining each open reading frame analyzed here. The order of the 
codons in the table is the same as the previous compilation (see 
the CODONLABEL file or REFERENCES). 

To reveal the characteristics of codon use of a wide range of 
organisms, as well as viruses and organella, the frequency (per 
1000) of codon use in 461 organisms for which >20 genes are 
available was calculated by summing up numbers of codon used. 
World Wide Web clients, such as NCSA Mosaic and Netscape, 
may be used to query this file. A user can display a codon usage 
table by clicking an anchor for selecting species or searching with 
species' name (Fig. 1). 
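
The per-thousand tabulation described above can be sketched as follows. This is a minimal illustration, assuming each gene is supplied as an in-frame coding sequence string; the function names and the two toy sequences are hypothetical and are not taken from the CUTG files.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count codons in one in-frame coding sequence (DNA or RNA letters)."""
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def per_thousand(cds_list: list[str]) -> dict[str, float]:
    """Sum codon counts over all genes, then scale to occurrences per 1000 codons."""
    total = Counter()
    for cds in cds_list:
        total.update(codon_counts(cds))
    n = sum(total.values())
    return {codon: 1000.0 * count / n for codon, count in total.items()}


# Toy sequences for illustration only (not real genes).
genes = ["ATGAAAAAAGGCTAA", "ATGGGCGGCAAATGA"]
freq = per_thousand(genes)
for codon in sorted(freq):
    print(codon, round(freq[codon], 1))
```

Codons that never occur in the input are simply absent from the result; a full 64-codon table would need those entries filled with zero.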

DISTRIBUTION AND ACCESS 

Complete form of the database is available by anonymous ftp 
from DDBJ: 

ftp://ftp.nig.ac.jp/pub/db/codon/GB90. 

The file README contains the latest information on the 
database in plain text format. 

The frequencies of codon use in 461 organisms for which >20 
genes are available can be accessed on the following WWW 
server: 

http://tisun4a.lab.nig.ac.jp/codon/CUTG.html. 
Comments on the database can be sent to cutg@lab.nig.ac.jp by 
e-mail. 
Figure 1. Snapshot of the CUTG home page. 

SUMMARY 



[...] agreement between YSC and PRI is particularly striking as they share six low-usage codons. All six carry the dinucleotide [...] 
[...] codons are clearly avoided in genes encoding abundant proteins for ECO, YSC and DRO. In all species [...] 

Low codon usage is relatively insensitive to gross base composition. However, dinucleotide usage can sometimes influence 
codon usage. This is particularly notable in the case of CG dinucleotides in PRI. 



INTRODUCTION 

Amino acids (aa) that are represented by more than one 
codon usually do not use synonym codons equally (e.g., 
Grantham et al., 1981; Gouy and Gautier, 1982). Indeed, 
the differential use of codons is most striking and 
species-specific and raises many interesting possibilities and 
concerns (Andersson and Kurland, 1990). Studies on ECO and 
YSC have shown that high abundance proteins show a 
sharp avoidance of codons that are in low usage in the 
overall gene population (Post et al., 1979; Ikemura, 1981; 
1982; Bennetzen and Hall, 1982), a finding that has led to 
the suggestion that low-usage codons may be more difficult 
to translate (Grosjean and Fiers, 1982; Konigsberg and 
Godson, 1983; Kurland, 1987). This suggestion is supported 
by the observation that cognate tRNA abundance is 
roughly proportional to codon usage for most codons 
(Ikemura, 1985) and that rates of translation can vary with 
concentrations of charged tRNA, at least in ECO (Rojiani 
et al., 1990). Direct assays of translatability lend some 
support to this view (Robinson et al., 1984; Bonekamp et al., 
1985; Carter et al., 1986; Hoekema et al., 1987; Sørensen 
et al., 1989; Curran and Yarus, 1989; Chen and Inouye, 
1990). Further, expression of at least one heterologous 
protein in ECO required oligodeoxynucleotide synthesis of the 
entire gene with high usage synonym codons of ECO, since 
message containing the naturally occurring codons of the 
gene (very rich in ECO's low-usage codons) failed to 
support measurable translation (Abate et al., 1990; T. Curran, 
personal communication). 

Abbreviations: aa (a.a. in Tables), amino acid(s); bp, base pair(s); DRO, 
Drosophila melanogaster; ECO, Escherichia coli; PRI, primates; r, ribosomal; 
STP, stop codon; YSC, yeast Saccharomyces cerevisiae. 

Whereas papers summarizing codon usage have 
appeared sporadically (Grantham et al., 1981; 
Ikemura, 1985; Sharp and Li, 1986; Sharp et al., 1988; 
Wada et al., 1990), the amount of information keeps 
increasing at the rate of about 2 × 10^6 bp per year. This 
necessitates periodic revaluation and updating of codon 
usage. There is also the need for new ways of looking at 
codon usage which requires new perspectives and new com- 
puter programs. For example, Gutman and Hatfield (1989) 
have considered the problem from the perspective of fre- 
quency of appearance of codon pairs. 

For the purpose of this study and to stay within rea- 
sonable practical limits, we will confine the information 
presented in this paper to ECO, YSC, DRO, and PRI. 
Mitochondrial genomes are not included in this analysis. 
Here, we shall describe different ways of estimating codon 
usage for these four classes of organisms. In addition to 
serving as an update of previous papers, a new way of 
evaluating codon usage is introduced and the relationship 
between codon usage and dinucleotide frequency is 
addressed. A major purpose of this paper is to give an 
accurate assessment (as of September 1990) of low-usage 
codons in different species. A second and more elusive goal 
is to seek explanations for the evolutionary choices that 
have been made. 



RESULTS AND DISCUSSION 

(a) Criteria for identifying low-usage codons 

We shall define codon usage as the number of times a 
codon is translated per unit time. This has not been measured 
directly in vivo, so estimates of codon usage are 
obtained by indirect observations. It should be appreciated 
that codon usage is likely to be different under different 
conditions of growth. For example, under conditions of 
rapid growth when there are plenty of nutrients, the overall 
rate of protein synthesis will be maximal and the synthesis 
of proteins that are needed for rapid growth should be 
favored. By contrast, under conditions of nutrient limitation, 
a new set of proteins is most likely to dominate 
metabolism and the overall rate of protein synthesis would 
be reduced (Ingraham et al., 1983). We will describe three 
ways of estimating codon usage. Each of them has its 
advantages and drawbacks. Fortunately, they all give 
approximately the same answer as regards the hierarchy for 
'most used' and 'least used' codons within each synonymous 
codon family. 

(1) Sums of codon appearance 

The most common way of measuring codon usage is by 
summing the number of times codons appear in the reading 
frames of the genome. This approach is summarized in 
Table I for four different types of organisms under dis- 
cussion. This method should overestimate codons used 
infrequently and underestimate codons used frequently. 
The reason for this is that, when averaging over the entire 
genome, no weight is given to the number of times different 
reading frames are used, which is reflected in the variable 
quantities of protein products. This weighting is a complex 
resultant of transcriptional efficiency, message stability, and 
translational efficiency. In general, this weighting is not 
precisely known but frequently it can be roughly estimated 
from the amount of gene-encoded protein product. 
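
The bias described above can be made concrete with a small sketch that compares an unweighted sum with one weighted by an estimate of protein abundance. The gene names, sequences and abundance values here are hypothetical, and the weighting scheme is only one simple way to apply the idea, not the procedure used in the paper.

```python
from collections import Counter


def usage_fractions(genes: dict[str, str], abundance: dict[str, float]) -> dict[str, float]:
    """Fraction of all codon occurrences contributed by each codon.

    Each gene's codons are weighted by its abundance estimate; genes with
    no estimate get weight 1.0, which reproduces the plain unweighted sum.
    """
    total = Counter()
    for name, cds in genes.items():
        weight = abundance.get(name, 1.0)
        seq = cds.upper().replace("T", "U")
        for i in range(0, len(seq) - len(seq) % 3, 3):
            total[seq[i:i + 3]] += weight
    n = sum(total.values())
    return {codon: value / n for codon, value in total.items()}


# Hypothetical data: one highly expressed gene and one weakly expressed gene.
genes = {"abundant": "ATGAAAAAA", "scarce": "ATGAGGAGG"}
abundance = {"abundant": 9.0, "scarce": 1.0}

unweighted = usage_fractions(genes, {})
weighted = usage_fractions(genes, abundance)
print(unweighted["AGG"], weighted["AGG"])
```

In this toy case the codon found only in the weakly expressed gene accounts for a third of the unweighted total but a much smaller share once abundance is taken into account, which is the direction of the overestimate noted in the text.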

(2) Absence of particular codons in genes 

A second way of estimating codon usage is to examine 
the distribution of usage in different genes. We have done 
this by scoring the number of protein reading frames in 
which a particular codon does not appear. In Table II, these 
data are directly compared for the four species under con- 
sideration. A large number indicates a narrow distribution 
for the codon in question. No attempt has been made to 
weight this method by scoring the number of times a codon 
appears in a gene. Only the presence or absence of a codon 
within a gene has been scored. This approach to measuring 
codon usage is based on the prediction that the most used 
codons should have the broadest distribution and the least 
used codons should have the narrowest distribution. The 
data in Table II are presented in a way that is most con- 
venient for making comparisons of relative usage between 
different organisms. For this purpose, the numbers have 
been normalized to those actually observed for DRO. The 
normalization factors are given in the footnote of Table II. 
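
A minimal sketch of this second method is given below. It scores only presence or absence of a codon in each reading frame, as the text describes. The scaling step assumes that normalization means multiplying by the ratio of reference gene count to species gene count; the actual normalization factors are in the Table II footnote, which is not reproduced here. All names and sequences are hypothetical.

```python
def codon_set(cds: str) -> set[str]:
    """Return the set of in-frame codons present in a coding sequence."""
    seq = cds.upper().replace("T", "U")
    return {seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)}


def genes_lacking(cds_list: list[str], codon: str) -> int:
    """Number of reading frames in which the codon never appears."""
    return sum(1 for cds in cds_list if codon not in codon_set(cds))


def normalize_to_reference(count: int, n_genes: int, n_reference: int) -> float:
    """Scale a count to the gene total of a reference species (assumed rule)."""
    return count * n_reference / n_genes


# Toy reading frames for illustration only.
genes = ["ATGAAATAA", "ATGAGGTAA", "ATGGGCTAA"]
lacking = genes_lacking(genes, "AGG")
scaled = normalize_to_reference(lacking, n_genes=len(genes), n_reference=6)
print(lacking, scaled)
```

Working on a set of in-frame codons, rather than searching the raw string, avoids counting a codon that only appears out of frame.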

(3) Combinations of excluded codons 

The third way in which we have estimated codon usage 
starts with a similar notion to that used in the second 
TABLE I 

A comparison of codon usages among four different species 



A. ECO: total codons: 323039, total proteins: Q6B 




1552 (Qrsl O 

1061 (Cys) C 

859 (SIP) A 

4188 (Trp) G 



1204 (L«i) 

3135 (L«i) 

9SO (Len) 

17477 (L«) 



3133 (Pro) 
1384 (Pro 
2813 (Pro 
7714 (Pro 



3flfl6 (HU) 

3471 (HU) 

4iii (Gin) 

8589 (Gin) 



8805 (He) 
871J (He) 
1273 (lie) 
8*07 (Mat) 



7881 (Apj) U 

004ft (Aigj C 

«*9 fAlj] A 

1491 [Alf] G 



34SB (Thr) 
TS73 (Thr) 
3108 (Thr) 
4030 (Thr) 



5263 (Asa) 
7«68 fA») 
12104 (Iffi) 
3857 (I<yi) 



2580 (S»r) U 

4881 (Ser) C 

«M (Arj) A 

448 (Aty) G 



8716 (V*l) 
48K (Val 

I89e 

7072 



SSI 



5770 (Ala) 10382 (Aflp) 

7490 AU) 71 IB (Asp 

8760 (All) 14134 (Glu) 

10641 (AH) 6149 (Glu) 



B. YSC: total codons: 2410C4, total proteins: 484 



0502 (Gly) U 

H824 (G] r » C 

2H« (Gly) a 

3110 (Gly) G 



5659 (Phe) 
4813 (Ph«) 
3857 iUu) 
7776 (Leu) 



5554 ($er 
3535 (Ser 
3722 (Ser 
1555 (Set) 



3362 (Tyr) 
M9S (Tyr 
258 (STP) 
92 (STP) 



1843 (Or*) 



2297 
981 
2840 
2003 



;l«> 

Leu) 
L«) 



800 (Qr») c 
138 (STP) A 
2«0 (Trp) G 



808) (Pro) 
1353 (Pro 
8182 (Pro 
9SB (Pro 



2977 (His) 
2017 (Hi*) 
70S8 (01 q) 
2500 (Gift) 



74M (IU) 

44M (ru) 

3054 (He) 
5130 [Met) 



5281 (Ttr 
2425 (Tir 
3703 (Thr 
1587 (Hit 



75D7 {Asa) 
8235 (Aid) 
9104 (Ljs) 
8536 (Lys) 



1801 (Ar S ) U 

487 (Ar S ) C 

531 (Ar,) A 

288 (Arg) O 

2802 [Ser) U 

1782 (Ser) 0 

mt (Ar 5 ) A 

1814 (Arj) G 



«86 (V*I) 6S84 (AU) (A«> 
3«2 (Wl) 37fl5 AU 5418 ffi) 

2280 (VH) 1S48 (At.) 4083 Gin) 



8376 (Gly) 
2 US (01 y) 
3137 (Gly) 
1238 (Gly) 



U 
C 
A 

a 



C. DRO: total codons: [illegible], total proteins: 244 




0 



S«] 1837 (Tyr) 710 

■'Sari S514 (Tyr) 1783 

Ser) 127 [STP) 80 

Ser) 67 [STP) 1888 (TVp'j 



Pro) 
Pro) 



164S Pro) 
" — Pro) 



1*04 (Hia) 1267 (Arrl 

2135 (His) 3217 Ar K ) 

im (Gin) 873 

4870 [Gtn) 908 



2001 (l[e) 
8337 lie 
882 (lie) 
2997 (Met) 



(Arsl 



Tbr) 
2824 Thr) 
Thr) 
1704 (Tar) 



2808 (Ain) 
WIS (A«il) 
1802 (Ly e ) 
W23 (LyO 



1215 

3369 (S«r) 
629 (Atb) 
777 (A^g) 



(S«r) 

(S 



1S71 (Yal) 
1958 (Vkl 

3424 (Vai 



1000 [AU) 
4797 (Ala) 
I486 (Ala) 
1714 



D. PRI: total codons: [illegible], total proteins: 1518 



8884 (Phc) 
U 13719 (Pba) 
M78 (Leu) 
8870 (Leo) 



6338 (Lni) 
12177 [L«i 

3871 [L*uJ 
202U (Ul) 



8382 (Pro) 6643 (HU) 

12i38 (Pro 8815 (HU1 

8007 (Pro) Sfifll (oin) 

3SQ1 (Pro) 20958 (Gin) 



8154 (II« 

1«1» (iu; 

3805 file) 
13572 (Mm) 



7778 (Thr 
13820 (Thr 
ST88 [Thr 
4M4 (Tbr 




tOIlfi (Asn) 
18569 (Asa) 
130*6 (L?«) 
31615 (Ly*) 
U 
c 

A 
G 



V 
C 
A 

a 



3388 (A»p) 1953 (Gly) U 

3199 (A»p) 3824 (Gly) C 

2308 (Glj) (Gly) a 

8773 (Glu) 599 (Gly) C 



MM 8„ 7504 Tyr) 5882 (Qy.) U 

1JOT Ser) 11188 (Tyr) 8B22 (Qy S O 

8788 (Scr) 445 (STP) 774 (STP) A 

2«8 (Ser) 302 (STP) 8390 (Trp C 



2818 [Arg] IT 

8845 (Ar ff ] C 

8358 Ate) ' A 

8354 (Arg) G 



5?47 (S«r) y 

11S85 (S«r) G 

6088 (Arn) A 

M27 (Arif) G 



Ala) 13231 (Aap) 

Ala) 17015 (Aap) 

Ala) (Glu) 

Ala) 25451 (Glu) 



8«7 (Gly) 
15198 (Gly) 
10445 (Gly) 
10497 Gly) 



U 
C 
A 
G 



method but instead considers combinations of codons that 
are missing from the maximum number of genes. The 
assumption here is that combinations of low-usage codons 
should be excluded from the most active genes, because the 
more low-usage codons present in a gene, the greater the 
potential reduction in the amount of translated product. 
Although this remains to be experimentally tested in a 
rigorous fashion, there are a number of reasons to predict 
this result (see INTRODUCTION). The program searches for 
combinations of codons (two or more) that are excluded 
from the maximum number of reading frames of the species 
in question. To make the computer calculations practical, 
only the 20 least used codons defined by method 1 above 
were used in the pool to determine combinations of 
least-used codons. In Table III, results are presented for 
combinations of up to eight codons with the least favored 
incidence; the number of protein reading frames that 
exclude the combination is given at the left in the table. 
Frequently, there is little difference between first, second 
and third (not shown) choices, indicating a degree of 
uncertainty in giving priority to the choices. 
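
The search described for this third method can be sketched as an exhaustive scan over codon combinations drawn from a small pool. This is an illustrative reconstruction under stated assumptions, not the authors' program: the function names, the tie-breaking rule (first combination found wins) and the toy reading frames are all hypothetical.

```python
from itertools import combinations


def codon_set(cds: str) -> set[str]:
    """Return the set of in-frame codons present in a coding sequence."""
    seq = cds.upper().replace("T", "U")
    return {seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)}


def best_excluded_combination(cds_list: list[str], pool: list[str], size: int):
    """Find the combination of `size` codons from `pool` that is absent
    from the largest number of reading frames."""
    gene_codons = [codon_set(cds) for cds in cds_list]
    best, best_count = None, -1
    for combo in combinations(pool, size):
        # A gene excludes the combination if it uses none of its codons.
        count = sum(1 for codons in gene_codons if codons.isdisjoint(combo))
        if count > best_count:
            best, best_count = combo, count
    return best, best_count


# Toy reading frames for illustration only.
genes = ["ATGAGGAAA", "ATGAGAAAA", "ATGCTAAAA", "ATGAAAAAA", "ATGCTAGGC"]
pool = ["AGG", "AGA", "CUA"]
best, count = best_excluded_combination(genes, pool, size=2)
print(best, count)
```

Restricting the pool keeps the scan small: choosing eight codons from a pool of 20 gives 125,970 combinations, which is consistent with the text's remark that limiting the pool makes the calculation practical.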

For ECO, there is a stepwise addition of a new low-usage 
codon at each stage as the size of the combination increases. 
This is also true for YSC except after stage 7 (combinations 
containing seven codons) where GGG is eliminated and 
AGG and ACG are simultaneously added. For DRO, a 
similar stepwise pattern of addition is seen as for ECO. The 
situation is far more complex and intriguing for PRI. The 
TABLE II 

Relative numbers of proteins not using a particular codon in four species 



UUU (Phe) 
UUC (Phe) 
UUA (Leu) 
UCU (Ser) 
UCC (Ser) 
UCA (Ser) 
UCG (Ser) 
UAC (Tyr) 
UAA (STP) 
UAG (STP) 
UGU (Cys) 
UGC (Cys) 
UGA (STP) 
UGG (Trp) 
CUU (Leu) 
CUC (Leu) 
CUA (Leu) 
CUG (Leu) 
CCU (Pro) 
CCC (Pro) 
CCA (Pro) 
CCG (Pro) 
CAU (His) 
CAC (His) 
CAA (Gln) 
CAG (Gln) 
CGU (Arg) 
CGC (Arg) 
CGA (Arg) 
CGG (Arg) 



ECO 


YSC 


DRO 


17 


15 


3d 


17 


g 


1£ 


40 


g 


1S5 


37 




SB 


33 


5 


49 


43 


11 


14 


68 


50 


es 


80 


SB 


IS 


30 


is 


87 


SO 




to 


82 




117 


328 


LBS 


177 


S3 


.40 


76 


79 


89 


1» 


179 


175 


194 


43 


31 


ta 


41 


48 


48 


38 


85 


JO 


127 


25 


6ff 


9 


54 


16 


M 


2B 


54 


98 


fiB 


17 


49 


7 


27 


20 


S7 


Jl 


38 


35 


41 


41 


as 


25 


SB 


5 


52 


10 


48 


11 


10 


ST 


31 


21 




SO 


134 


144 


67 


110 


167 


72 



PRI 


iwlmn />il 






UKO 


PHI 


35 


AUU(De) 


1G 


4 


18 


38 


14 


AUC(]feJ 


10 




17 


14 

88 


110 


AUA(tt*j 


134 


&S 


84 


U 


AUG (Mm) 


1 


0 


0 


0 


2B 


ACVPUir) 


3S 


6 


50 


28 


Id 


AOOfThr) 


10 


14 


13 


S 


40 


ACAfTir) 


66 


39 


61 


27 


64 

X* 


ACOfThrj 
AAU(Ato) 


3D 
39 


A3 
1$ 


43 

25 


SO 
83 


i7a 


AAC(Am) 
AAA(Lj»] 


13 
4 


4 


11 
33 


fl 
17 


145 


AAG (Lys) 


23 


* 


9 


t 


46 


AOU (Scr) 


S5 


40 


ea 


40 


38 


ACC(S«r) 


20 


51 


18 


14 


120 


AGA (A*) 


181 


7 


05 


45 


26 


AGG (Arg) 


ISO 


56 


87 


2J 


35 


GUUfVil) 


10 


6 




as 


13 


GUC (VilJ 


S3 


li 




IS 


58 


GUA (V«l) 


18 


S3 


« 


79 


5 


CUG (V&l) 


17 


43 


16 


7 


21 


GCUfAb) 


IS 


7 


34 


10 


10 


GCC (Ala) 


18 


10 


1 


4 


34 


GGA (AU) 


11 


50 


40 


18 


73 


GCG (Ala) 


IS 


73 


38 


€4 


46 


GAU (Asp) 


u 


6 


14 


18 


21 


GACfAap) 


13 


& 


17 


5 


41 


GAA[Gh) 


7 


2 


25 


17 


4 


GAG (Gltt) 


19 


25 


7 


4 


81 


GGUfGJjr} 


13 


8 


20 


32 


47 


GGO (Gty} 


14 


34 


11 


10 


SB 


GGA(G[y) 


66 


47 


15 


is 


45 J 


ggg (car) 


43 


70 


BO 


18 




four codons at stage 4 are eliminated at stage 5 never to 
reappear as part of the favored low-usage combination at 
higher stages. It should be noted that these four codons are 

TABLE III 

Numbers of proteins not using a codon or combination of codons ᵃ 



all of the UA type. At stage 5, all four codons ending in CG 
emerge along with one codon, CGU, carrying a CG in the 
first two positions. At all future stages, codons containing 



A. ECO total proteins: 8€fi 

714 AGO 
5J& ACQ AOA 
352 AGO AOA AUA 
255 AGG AGA CUA AHA 
30$ AGO AGA CUA AUA OGA 
150 AGO AOA CUA AUA OGA CGG 
1!0 AGG AGA CUA AOA CGA OGG CCC 
83 AGG AGA CUA AUA OGA CGG CCC (JOG 



C. DRO total proteins: 244 

m VOA 
40 UUA AGA 
TO UUA AGA AUA 
55 UUA AGA GGG AUA 
10 UUA AGA GGG AUA OOG 
35 UUA AGA AUA COG ACT/ CGA 
80 CUA AGA GGG AUA CGG AGO OGA 
25 UUA AGA OQO AUA UGU OGG AGU CGA 



B. YSC total proteins: 484 

8SS OGG 

343 OGG CGA 

176 OGG CGA OGC 

186 CGQ CGA OGC COG 

116 CGG CGA OGC COG CUC 
35 COG CGA CGO COG COO OCO 
80 OGG CGA CGO COG CU0 GCG GGG 
72 CGG OGA CGO COG CUC GCG AOG AGO 

D. PRI total proteins: 1518 

687 UUA 

380 UUA AUA 

£18 UUA AUA GUA 

116 UUA AUA GUA CUA 
80 UOG CGU OCG GCG AOG 
60 CGA CGU COG CGO AOG OGG 
51 UCG CGU COG OGA AOG OGC CGG 
43 UCG CGU GCG OGA AOG CGO CGG GCG 



a Indicated codons are absent from the listed number of proteins on the same line, for each species.






a CG in either their first two or last two positions dominate.
At stage 8, we find all of the codons that carry the dinucleotide
sequence CG.

From the standpoint of computer programming this third
method of picking low-usage codons is the most complex.
However, with the program in hand it is easy to apply. For
purposes of comparison, in most of the remainder of this
paper we will use combinations of eight low-usage codons
determined by the third method. These codons for the different
species are presented together in Table IV. Each
species has a unique combination of eight codons, but all
species contain the Arg codons CGA and CGG. The agreement
between YSC and PRI is particularly striking as they
share six low-usage codons. All six carry the dinucleotide
sequence CG. It is appropriate to mention at this point that
CG is the dinucleotide sequence that is most avoided in
PRI. This may be related to the fact that the C in a CG
sequence is susceptible to methylation (e.g., see Razin and
Riggs, 1980) and is therefore reserved for special situations.
In YSC there is no methylation of C residues in CG
sequences. Despite this we shall see below that this dinucleotide
is also the most unpopular in YSC.
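
The combination search can be illustrated with a short sketch. It is not the authors' program, which is not reproduced in this paper. It assumes, as one reading of the 'combinations of excluded codons' method, that at each stage every k-codon combination drawn from a candidate list is scored by the number of reading frames that use none of its members, and that the highest-scoring combination is kept. The function names, the tie-breaking rule and the toy sequences are hypothetical.

```python
from itertools import combinations


def split_codons(seq):
    """Split an in-frame RNA coding sequence into codons (a trailing partial codon is dropped)."""
    seq = seq.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def best_excluded_combination(genes, k, candidates):
    """Return (n_genes, combo) for the k-codon combination absent from the most genes.

    Assumption: ties are broken by alphabetical order of the combination.
    """
    codon_sets = [set(split_codons(g)) for g in genes]
    best_count, best_combo = -1, ()
    for combo in combinations(sorted(candidates), k):
        # A gene counts only if it uses none of the codons in the combination.
        absent = sum(1 for used in codon_sets if used.isdisjoint(combo))
        if absent > best_count:
            best_count, best_combo = absent, combo
    return best_count, best_combo


# Toy reading frames (hypothetical, not GenBank data).
toy_genes = [
    "AUGCUGAAAGGU",
    "AUGCUGCGUAAA",
    "AUGAGGCUGAAA",
    "AUGAGACGACUG",
]
toy_candidates = {"AGG", "AGA", "CGA", "CGU"}

for stage in (1, 2):
    print(stage, best_excluded_combination(toy_genes, stage, toy_candidates))
```

Because each stage is searched independently, a codon chosen at one stage need not survive to the next, which is the behavior seen between stages 4 and 5 for PRI in Table III. An exhaustive search over all 61 sense codons taken eight at a time would cover roughly 2.9 billion combinations, so a practical version would presumably restrict the candidate list first.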

(b) Large codon families are more likely to contain low-usage
codons as modulators of expression

There is reason to believe that, if codon usage plays a role
in modulating gene expression, it is more likely to do so in
the larger codon families that contain three or more synonymous
codons. Two aa, Trp and Met, are represented by

TABLE IV

Low-usage codons a

ECO    YSC    DRO    PRI    aa
AGG    AGG    -      -      Arg
AGA    -      AGA    -      Arg
AUA    -      AUA    -      Ile
CUA    -      -      -      Leu
CGA    CGA    CGA    CGA    Arg
CGG    CGG    CGG    CGG    Arg
CCC    -      -      -      Pro
UCG    -      -      UCG    Ser
-      CGC    -      CGC    Arg
-      CCG    -      CCG    Pro
-      CUC    -      -      Leu
-      GCG    -      GCG    Ala
-      ACG    -      ACG    Thr
-      -      UUA    -      Leu
-      -      GGG    -      Gly
-      -      AGU    -      Ser
-      -      UGU    -      Cys
-      -      -      CGU    Arg

a The eight least used codons for each species, as determined by the
'combinations of excluded codons' method (see text and Table III).
single codons. The two-codon families include Phe, Tyr,
His, Gln, Asn, Lys, Asp, Glu and Cys. In most of these
cases, both codons are believed to be capable of recognition
by a single cognate tRNA (or anticodon) in ECO (Sprinzl
et al., 1987, and references therein). This does not exclude
the possibility that some of these codons are used as
modulators of gene expression, for example, because of the
different strengths of anticodon-codon interactions
(Grosjean et al., 1978; Yarus et al., 1986). In the cases of
all the remaining codon families, where three, four or six
codons are involved, at least two tRNAs are found for every
aa. Most frequently in ECO and YSC (sufficient information
is not available for DRO or PRI), as indicated above,
there is a correlation between the abundance of the tRNA
and the abundance of the codon (Ikemura, 1981; 1982).
There are notable exceptions, as in the case of the CGA
codon for Arg. This codon is recognized in ECO by the
same tRNA that recognizes the two high-usage codons
CGU and CGC (Murao et al., 1972). It seems reasonable
that we should focus our attention on the major codon
families to gain an appreciation of the significance of
low-usage codons.

In Table V, we have compared the three methods for
assessment of codon usage for the nine major codon
families. In this table, codons representing the same aa are
grouped together. For each species, the % is given for
codon usage determined by abundance in the genome as in
Table I. In parentheses, the normalized number is given of
proteins of the species that do not contain the codon as in
Table II. A large number in parentheses should correspond
to a small % if the two methods are consistent. Inspection
of Table V indicates that the two methods are in quite good
agreement for most aa. With only five exceptions (Val and
Ala in ECO, Val in YSC, and Ser and Thr in DRO), the
least abundant codon is excluded from the most proteins.
With only five exceptions (Val, Ala and Gly in ECO, and
Ser and Arg in PRI), the most abundant codon within the
aa codon family is excluded from the lowest number of
proteins; that is, it has the broadest distribution. In Table V,
the % use of codons designated as low-usage codons by the
combination of excluded-codons method described above
(see Tables III and IV) is underlined. Only in DRO do we
have a codon designated by this latter method that does not
appear in Table V (the UGU codon for Cys). Simple
inspection of Table V suggests that additional codons might
be designated as low-usage codons. However, we have
chosen to focus our attention on eight codons from each
species which we think are most likely to represent bona fide
low-usage codons. If we were to attempt to expand this list
we would run the risk of selecting codons that may not
possess the most important properties of low-usage codons.
Until such time as reliable quantitative estimates on rate of
translatability can be given, we prefer to make comparisons
with a smaller subset of low-usage codons that we can be
relatively sure of.



TABLE V

Synonymous codon usage for major codon families





Entries are % use a, with the normalized number of proteins lacking the
codon in parentheses b, listed for each species in the codon order shown.

Leu (UUA, UUG, CUU, CUC, CUA, CUG)
  ECO: 11 (46); 11 (37); 10 (41); 10 (30); 3 (127); 55 (9)
  YSC: 27 (8); 36 (4); 11 U8); 5 (88); 13 (25); 9 (54)
  DRO: 6 c (125); 18 (28); IS (30); 8 {CO); 44 (15)
  PRI: 6 (110); 11 (SB); 1 1 f 7Kl; 11 [«1 J; SI (12); 7 (50); 45 (5)

Ile (AUU, AUC, AUA)
  ECO: 47 (10); 48 (10); I (124)
  YSC: 50 (4); 20 (8); 20 (66)
  DRO: 53 (17); 14 (84)
  PRI: 53 (H); 13 (88)

Val (GUU, GUC, GUA, GUG)
  ECO: 20 (10); 17 (19); 34 (17)
  YSC: 44 («); 25 (l2); 16 (52); 15 («)
  DRO: 10 (22); 9ft flOl; 9 (68); 4a (is)
  PRI: 18 (38); Z5 (IB); 9 (79); 4fl (7)

Ser (UCU, UCC, UCA, UCG, AGU, AGC)
  ECO: 18 (33); 17 (42); 12 (68); 14 ieo\; 13 (05); 26 (20)
  YSC: 31 (5); 18 (11); IS (30); s tori; IS (40); 9 (51)
  DRO: 8 (49); 26 (U(; s (es); 22 (38); 12 (59); 24 (18)
  PRI: 18 (28); 24 (10); 13 (40); 6 (04); 13 (40); 36 (U)

Pro (CCU, CCC, CCA, CCG)
  ECO: IS (58); 10 (88); 10 (48); 56 (20)
  YSC: 29 (36); 13 (86); 49 (7); 1 (W)
  DRO: U (54); 36 (17); 23 (a?); 30 (31)
  PRI: 27 (21); 35 (10); 28 (34); 11 (73)

Thr (ACU, ACC, ACA, ACG)
  ECO: 20 (29); 46 (16); 12 (6S); 23 (33)
  YSC: 3S (S); 24 (14); 26 (38); 11 (B3)
  DRO: 16 (£0); 42 (15); 17 (81); 25 (4«)
  PRI: S3 (28); 40 (0); 25 (27); 12 (AO)

Ala (GCU, GCC, GCA, GCG)
  ECO: 19 (14); 24 (18); 23 (11); 36 (IS)
  YSC: 44 (7); 24 (10); 24 (30); 8 (73)
  DRO: 19 (24); 49 (3); 15 (40); 17 (28)
  PRI: 28 (10); 42 (4); 20 (18); U (84)

Arg (CGU, CGC, CGA, CGG, AGA, AGG)
  ECO: 43 (19); 37 (21); 5 (124); 8 (110); 7 (Ifll); 2 (180)
  YSC: 17 (57); 4 (128); 5 (144); 2 (187); 54 (7); 17 (58)
  DRO: 19 (31); S3 (20); 13 (67); 14 (72); 0 (05); 13 (87)
  PRI: 9 (81); 22 (47); 10 (68); 20 (45); 18 (45); 21 (23)

Gly (GGU, GGC, GGA, GGG)
  ECO: 38 (13); 40 (14); 9 (88); 14 (49)
  YSC: 80 (B); 15 (34); 15 (47); • (TO) 1
  DRO: fiS (20); 44 (11); 28 (15); I (88)
  PRI: 15 (32); 38 (10); 24 (35); 26 (18)



a Numbers in each column indicate % use of synonymous codon (derived
from Table I). For example, in ECO we counted a total of 31 832 Leu
codons of all types in Table I. Of these, 17 477 were CUG codons; thus,
the % use of CUG is 17 477 ÷ 31 832 × 100 = 55% (in sixth line).
b Numbers in parentheses indicate normalized number of proteins lacking
the codon (taken directly from Table II).
c Underlined are numbers of low-usage codons determined by 'combinations
of excluded codons' method (see Tables III and IV).
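
The arithmetic in footnote a can be written out as a small sketch. Only the CUG count (17 477), the CUA count (960, quoted in section f) and the Leu total (31 832) come from the paper. The remaining Leu codons are lumped together here because their individual counts are given in Table I, and the function name is an assumption.

```python
def synonymous_usage_percent(counts):
    """Percent use of each codon within one synonymous family."""
    total = sum(counts.values())
    return {codon: 100.0 * n / total for codon, n in counts.items()}


# ECO Leu codons: the CUG and CUA counts and the family total are quoted in the text.
LEU_TOTAL = 31832
eco_leu = {"CUG": 17477, "CUA": 960}
eco_leu["other Leu codons"] = LEU_TOTAL - sum(eco_leu.values())

percent = synonymous_usage_percent(eco_leu)
print(round(percent["CUG"]), round(percent["CUA"]))
```

The rounded values agree with the ECO Leu entries for CUG (55) and CUA (3) in Table V.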



(c) Low-usage codons are preferentially avoided in genes
for abundant proteins in ECO, YSC and DRO

Implicit in the consideration of low-usage codons is the 
notion that such codons are more slowly translated and 
therefore avoided in genes for proteins required in large 
amounts. This notion is supported by a comparison of 
proteins containing small and large numbers of low-usage 
codons (e.g., Ikemura, 1985). For combinations of eight
excluded codons, the distribution of proteins according to 
the number of low-usage codons they contain is given in 
Table VI. 

It is informative to inspect those proteins that fall into the
'zero class', which uses none of the eight designated low-usage
codons, and perhaps equally informative to inspect
those proteins at the other extreme that use an abnormally
high concentration of these low-usage codons. In spite of
the fact that the precise molar amounts of each of these
proteins are not available, the trends are unmistakable.
Thus for the 93 proteins that are in the zero class for ECO,
35 r-proteins are found. Most of the remainder are major
proteins involved in protein synthesis initiation and elongation,
enzymes involved in intermediary aa metabolism
and carbohydrate metabolism and the major outer membrane
proteins, OmpR and OmpF. In YSC the zero class
also abounds with r-proteins. The zero class in YSC also
contains some histones, enzymes involved in aa biosynthesis
and fermentation, actin, alcohol dehydrogenase and
several ubiquitins. Of the 25 proteins in the DRO zero class
the same trend continues with a strong representation of
major proteins or at least proteins that are major in some
tissues some of the time. Thus, we see some r-proteins,
metallothionein, myosin, major heat-shock proteins, alcohol
dehydrogenase I, cytochrome c, tropomyosin and a
tubulin appearing in the zero class. Only in PRI is the trend
unclear. Thus, we do not find r-proteins, histones or major
enzymes in the zero class. Rather we find some globins,
interferons, interleukins and metallothionein. It may be that
these proteins are needed in large amounts in special situa- 



TABLE VI

Distributions of proteins with respect to the percentage of low-usage
codons they contain a

% of low codons      ECO    YSC    DRO    PRI
0%                          72     25     43
Above 0% to 3%       uz     168    48     307
Above 3% to 6%              100    04     559
Above 6% to 9%       102    46     48     882
Above 9% to 12%      30     5      16     155
Above 12% to 15%     12     2      8      73
Above 15%            10     1      ft     30

a Numbers refer to number of proteins containing the indicated % of
low-usage codons.



tions and that it is beneficial for their translation to
avoid the designated low-usage codons. Alternately it
may be that translation plays a smaller role in the
selection of low-usage codons in PRI than it does in the
other species considered here.
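
The classification behind Table VI can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that the percentage is the number of codons belonging to the designated low-usage set divided by all codons in the reading frame, and it uses the ECO codon set as read from Table IV. The function names and test sequences are hypothetical.

```python
ECO_LOW_USAGE = {"AGG", "AGA", "AUA", "CUA", "CGA", "CGG", "CCC", "UCG"}  # Table IV, ECO


def percent_low_usage(seq, low_codons):
    """Percent of codons in an in-frame RNA sequence that belong to low_codons."""
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence shorter than one codon")
    return 100.0 * sum(c in low_codons for c in codons) / len(codons)


def table_vi_class(pct):
    """Map a percentage onto the classes used in Table VI."""
    if pct == 0:
        return "0%"
    for upper in (3, 6, 9, 12, 15):
        if pct <= upper:
            return f"above {upper - 3}% to {upper}%"
    return "above 15%"


# Hypothetical reading frames with 0, 1 in 20, and 1 in 4 low-usage codons.
zero_class = "AUG" + "CUG" * 19
five_percent = "AUG" + "CUG" * 18 + "AGA"
quarter = "AUGAGGCUGAAA"

for seq in (zero_class, five_percent, quarter):
    pct = percent_low_usage(seq, ECO_LOW_USAGE)
    print(pct, table_vi_class(pct))
```

A protein falls into the 'zero class' discussed above exactly when its percentage is 0.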

(d) Proteins that contain a high percentage of low-usage 
codons in their genes belong to classes where an excess of
the protein could be detrimental 

It has been argued that only high-usage codons are
selected in abundant proteins, and there is no selection for
low-usage codons in low-expression-level proteins (Sharp
and Li, 1986; Andersson and Kurland, 1990). Nevertheless,
an examination of the trends seen in classes with
high frequency of low-usage codons raises the possibility of
selection for low-usage codons. For example, reading
frames found on transposable elements abound in classes
with high frequency of low-usage codons. Indeed, transposition
of the YSC Ty1 transposon has been reported to
be regulated by the concentration of the cognate tRNA for
one of the YSC low-usage codons, AGG (Xu and Boeke,
1990). It seems likely that transposition would be under
very tight control in most species. The remainder of the
proteins found in the ECO list with greater than 9% low-usage
codons include some toxins and lesser known regulatory
proteins. YSC in the 6% or greater low-usage class
contains several regulatory protein genes and nuclear genes
that encode products for mitochondria. The DRO proteins
with greater than 9% low-usage codons are most notable
for the large number of transposable element genes. They
also contain the Shaker genes which are believed to be
essential for potassium channels in the nervous system
(Pongs et al., 1988). The PRI with greater than 12% low-usage
codons abound in proto-oncogenes, growth factor
genes, several hormone genes and, strangely enough, several
histone genes. The finding of histone genes in the class with
high frequency of low-usage codons in PRI contrasts with
the findings in YSC where several histone genes are found
in the zero class. It may be argued that primate cells are held
in check by an abundance of low codon usage genes including
those likely to lead to rapid growth. Rapid uncontrolled
growth often spells disaster in PRI, in contrast to unicellular
organisms, and our findings can be rationalized in this way.
Alternately, as noted above, low-usage codons may not play
the same role in PRI as they do in the other species discussed
here.

(e) Codon usage by 'zero class' proteins is more selective
than for average proteins

It has been pointed out that proteins in the abundant
class have a more restricted codon usage. We have seen that
for ECO, YSC and DRO, the zero class also correlates
reasonably well with proteins found in the most abundant
group. We have engaged in the reciprocal process of selecting
the abundant proteins first and then determining their
codon usage from the available information in GenBank
(results not shown). This reciprocal approach is similar to
that taken by Sharp and Li (1987), in which these authors
used abundant proteins in ECO and YSC to determine a
'Codon Adaptation Index'. This approach gives very similar
results for ECO and YSC, but is more difficult to apply in
dealing with organisms that have differentiated cells. Our
approach of selecting combinations of excluded codons is
more systematic and subject to computer analysis with a
minimum of preparation.

Codon usage for zero class proteins is compared with 
codon usages for all proteins in Table VII. Only data for the 
major codon families are presented. It can be seen that the 
most used codons are usually the same in both cases.
Exceptions (a total of eight; marked with asterisks) are
usually close calls. In many cases for ECO, YSC, and DRO
but not PRI, the differences between the most used codons
and other codons are more extreme. Indeed, there are more
codons in the low-usage group (5% or less) in the highly
restricted zero class collection: nine additional in the case
of ECO, 14 additional in the case of YSC, eight additional
in the case of DRO, but only one additional in the case of
PRI. Many of these additional codons in the 5% or less
groupings may be low-usage codons in the sense that they 
may translate more slowly under some or all conditions of 
growth. This would be a reasonable explanation for their 
being more scarce in proteins that are synthesized in larger 
amounts. To decide this, as in all cases, actual measure- 
ments of translation rates will have to be made for each of 
the codons individually. 

(f) Choices of low-usage codons are relatively insensitive to
gross base composition

To begin a consideration of the origin of varying low-usage
codons in different species, we might first examine the
relationship between codon usage and base frequencies in
reading frames. This information is presented in Table VIII
as the ratio of the observed number of codons vs. the
expected number of codons calculated from base frequencies
within reading frames on the coding strand. The
most relevant relationships are for synonymous codons.

A ratio greater than one indicates a codon that appears
at a higher-than-expected frequency based on the observed
base composition of the reading frames. Similarly, a ratio
of less than one indicates a codon that appears at
lower-than-expected frequency. In almost all cases, the numbers
presented in Table VIII reflect the absolute numbers presented
in Table I. For example, the ratios of 0.21 and 3.43
for the ECO Leu codons CUA and CUG in Table VIII
reflect the much lower (960) and much higher (17 477) numbers
for these codons in Table I. Thus, the extreme numbers



TABLE VII

Comparison of codon usage a for all proteins (T) b and for 'zero-class'
proteins (Z) c for major codon families

a Numbers are % of listed codons for the same aa. The numbers for each
aa add to 100%. Asterisks indicate cases where the most-used codon is
different in zero class proteins.
b Values taken directly from Table V.
c 'Zero class proteins' are those proteins that contain no residues
encoded by any of the designated eight low-usage codons listed in
Table IV.

observed for these codons cannot be explained by gross
base compositions of the reading frames. This does not
mean that gross base compositions have no influence on
differential codon usage. For example, consider the case of
the Lys codons AAA and AAG in YSC. Table IB shows
that there are more AAA than AAG codons (9104 vs.
8536). Nevertheless, AAA codons are closer to the expected
ratio than AAG codons (1.18 vs. 1.66), raising the possibility
that the more favored base composition of the AAA
codon influences the higher frequency of usage of this
codon. YSC probably has more examples like this than the
other species we are considering here because the base
frequencies for YSC differ most from the equimolar value
(cf. top of Table VIII, A through D). Analyses of species
with more extreme base compositions would be interesting
in this regard. Our main conclusion from Table VIII is that
low-usage codon choices are only influenced in a minor way
by the gross base composition. Other workers have
reported analyses indicating that overall codon usage
patterns are influenced by gross base compositions (e.g.,
Bibb et al., 1984; Osawa and Jukes, 1988); however, these
analyses primarily reflect codon usage patterns of average
and high-usage codons.
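
The observed-over-expected ratio for codons can be sketched as below. The footnote to Table VIII that defines the expected numbers is not reproduced in this section, so the sketch assumes the same independence model that the Table IX footnote spells out for dinucleotides: the expected count of a codon is the number of codons multiplied by the overall frequencies of its three bases in the reading frames. The function name and the toy data are hypothetical.

```python
from collections import Counter
from itertools import product


def codon_observed_over_expected(codons):
    """Ratio of observed to expected count for every codon present in the list.

    Assumption: expected count = N * p(b1) * p(b2) * p(b3), where p() is the
    overall base frequency in the reading frames and N the number of codons.
    """
    n = len(codons)
    base_counts = Counter("".join(codons))
    n_bases = 3 * n
    ratios = {}
    for codon, observed in Counter(codons).items():
        p = 1.0
        for base in codon:
            p *= base_counts[base] / n_bases
        ratios[codon] = observed / (n * p)
    return ratios


# Sanity check: if all 64 triplets occur once, every ratio is 1.
uniform = ["".join(t) for t in product("ACGU", repeat=3)]
uniform_ratios = codon_observed_over_expected(uniform)

# Toy example with skewed base composition (hypothetical data).
toy_ratios = codon_observed_over_expected(["AAA", "AAA", "AAG", "GGG"])
print(toy_ratios)
```

In the toy example AAA is observed twice as often as AAG, yet both have the same ratio, because the A-rich composition already predicts more AAA. This is the kind of effect described above for the Lys codons in YSC.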

(g) Low codon usage appears to be influenced by dinucleotide
usage in some cases

In the previous section we saw that the choice of low- 
usage codons in the four groups of organisms we chose to 
consider is not influenced to any appreciable extent by the 
gross base composition. In this section we consider a 
related possibility that the choice of low-usage codons is 
influenced by dinucleotide preferences. It has been argued 
that dinucleotide preferences govern to a large extent codon 
choices in eukaryotes (Nussinov, 1981; Alff-Steinberger, 
1987). Dinucleotide frequencies for the different species are
presented in Table IX, together with the ratio of the 
observed-to-expected frequencies based on gross base com- 
position. If there is a strong bias against the use of certain 
dinucleotide sequences it should show up in this ratio. 
Inspection of Table IX shows seven cases (underlined) 
where this ratio is 0.73 or less. The UA dinucleotide ratio 
is low for all species (0.71 ECO; 0.69 YSC; 0.58 DRO; 
0.53 PRI), the CG ratio is low for both YSC (0.72) and PRI 
(0.48) and the AG ratio is low for ECO (0.73). As was seen 
in Table IV, ECO contains two low-usage codons, AGG 
and AGA, with the AG sequence, and two low-usage 
codons, CUA and AUA, with the UA sequence. YSC 
contains six low-usage codons, CGA, CGG, CGC, CCG, 
GCG, ACG, with the CG sequence. DRO contains two 
low-usage codons, AUA and UUA, with the UA sequence. 
Finally, all eight of the low-usage codons in PRI contain the 
CG sequence. On the basis of these correlations alone we 
must consider the proposition that there is some unknown 
pressure or pressures that cause certain dinucleotide 
sequences to be underrepresented which in turn results in
an underrepresentation of the related codon or codons.
However, in the process of attempting such an analysis we
must be careful to distinguish cause and effect. Therefore,
we must try to determine whether low dinucleotide frequency
is the cause of low codon frequency or the other way
around.
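
The codon counts quoted in this paragraph can be checked with a few lines. The sketch takes the low-usage codon lists as read from Table IV and counts the codons that contain a given dinucleotide in either the first two or the last two positions. The variable and function names are hypothetical.

```python
# Low-usage codons as read from Table IV.
LOW_USAGE = {
    "ECO": ["AGG", "AGA", "AUA", "CUA", "CGA", "CGG", "CCC", "UCG"],
    "YSC": ["AGG", "CGA", "CGG", "CGC", "CCG", "CUC", "GCG", "ACG"],
    "DRO": ["AGA", "AUA", "CGA", "CGG", "UUA", "GGG", "AGU", "UGU"],
    "PRI": ["CGA", "CGG", "UCG", "CGC", "CCG", "GCG", "ACG", "CGU"],
}


def codons_with(dinucleotide, codons):
    """Codons that carry the dinucleotide at positions 1-2 or 2-3."""
    return [c for c in codons if dinucleotide in c]


for species, dinucleotide in [("ECO", "AG"), ("ECO", "UA"), ("YSC", "CG"),
                              ("DRO", "UA"), ("PRI", "CG")]:
    print(species, dinucleotide, codons_with(dinucleotide, LOW_USAGE[species]))
```

The counts only restate the correlation. They say nothing about whether the dinucleotide bias causes the codon bias or the reverse, which is the question taken up next.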

Let us first consider the AG case for ECO. The two Arg
codons AGA and AGG are both low-usage codons in
ECO. If we examine the AGX box in which they occur
(Table I), we find that this box is shared by two Ser codons
and two Arg codons. The Ser codons are well represented



compared to other Ser codons. Based on this comparison 
alone we cannot make a case for saying that these low- 
usage codons for Arg in ECO are the result of the avoidance 
of the AG sequence. Other factors must be involved. 

Next let us consider the UA case for DRO. Two codons,
AUA for Ile and UUA for Leu, have been designated as
low-usage codons. If we look at the other codons contain- 
a Numbers to the left refer to totals for all reading frames.

b Numbers in parentheses refer to ratios of observed over expected, where expected is calculated from gross base composition of the reading frames.
Ratios of 0.73 or less are underlined. As an example, for the GA dinucleotide in ECO, the expected number would be calculated as follows. The probability
that any position will contain a G is 0.27390, or an A is 0.24595 (see footnote to Table VIII for derivation of these values). The probability of finding
the dinucleotide is the product of these two individual probabilities (0.27390 × 0.24595 = 0.06737). Thus, out of 965 333 dinucleotides the expected
number of GA dinucleotides is 65 034 (0.06737 × 965 333). The observed number of GA dinucleotides was 63 548. Therefore, the ratio of observed over
expected is 63 548 ÷ 65 034 = 0.98.



ing UA (see Table I) we see that these are low, even though 
they have not been designated as low-usage codons. The 
other box containing the UA sequence contains the two Tyr 
codons and two of the stop codons. Since stop codons are 
always underrepresented this will be a constant factor in 
reducing the UA frequency. The same is true for other 
species. If we look at the other species we find that there 
is a tendency for the NUA codons to be underrepresented 
in the family boxes (except for UUA and CUA in YSC). 
Whether the UA sequence is influencing the codon usage or 
vice versa is hard to say. 
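
The observed-over-expected ratio used throughout this section follows the worked GA example in the footnote to Table IX, and can be sketched as below. The first function reproduces that arithmetic from the quoted base probabilities and counts. The second applies the same independence model directly to a sequence, counting dinucleotides in overlapping windows, which is an assumption about how the counts were taken. Function names are hypothetical.

```python
def expected_count(p_first, p_second, n_dinucleotides):
    """Expected number of a dinucleotide if neighbouring bases were independent."""
    return p_first * p_second * n_dinucleotides


def observed_over_expected(seq, dinucleotide):
    """Observed/expected ratio for one dinucleotide, counted in overlapping windows."""
    n = len(seq) - 1
    observed = sum(1 for i in range(n) if seq[i:i + 2] == dinucleotide)
    p_first = seq.count(dinucleotide[0]) / len(seq)
    p_second = seq.count(dinucleotide[1]) / len(seq)
    return observed / expected_count(p_first, p_second, n)


# Worked example from the Table IX footnote (GA in ECO).
expected_ga = expected_count(0.27390, 0.24595, 965333)
ratio_ga = 63548 / expected_ga
print(round(expected_ga), round(ratio_ga, 2))

# Toy sequence (hypothetical): GA occurs twice among three overlapping windows.
toy_ratio = observed_over_expected("GAGA", "GA")
```

Without intermediate rounding the expected GA count comes out a few units below the 65 034 quoted in the footnote, which rounds the probability to 0.06737 first. The ratio still rounds to 0.98.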

To get some further indication on this point we may turn 
to the dinucleotide frequencies in the intron regions. These 
are recorded for YSC, DRO, and PRI in Table X for the
UA and CG sequences. In all cases the UA dinucleotide 
ratio is higher for the introns than for the exons. Thus when 
the coding pressure is lifted, as in the noncoding regions of 
the introns, the UA sequence gravitates towards the statisti- 
cally most probable ratio of 1.00. This comparison clearly 
favors the argument that the low UA dinucleotide frequency 
in the coding regions is probably caused by the coding 
pressure. 

The CG sequence is associated with six low-usage 



TABLE X

Ratio of observed to expected frequencies of UA and CG dinucleotides

a Exon ratios are taken from Table IX (b).

b Intron ratios are computed from 7582 dinucleotides for YSC, 56 497
dinucleotides for DRO, and 656 152 dinucleotides for PRI. The calculation
is similar to that of the exon ratios (see Table IX, footnote b);
however, the gross base composition in this case is taken from the overall
base composition, not just the base composition in reading frames.

codons in YSC and all eight in PRI. Since in both of these 
cases most of the designated low-usage codons contain 
most of the dinucleotide sequences in question we must 
carefully scrutinize the possibility that the low value for the 
CG sequence results from the low frequency of codons 
using this sequence. 



Inspection of the intron frequencies for YSC and PRI
gives different indications for the CG frequency. Thus in
YSC the frequency of CG moves closer to the expected
(0.72 vs. 0.85), whereas for PRI it moves even further from
the expected (0.48 vs. 0.29). Similarly, in YSC the overlapping
intercodon C/G dinucleotide in coding sequences
also moves closer to the expected, while in PRI the intercodon
C/G is found at much lower frequency than expected
(data not shown). It would be hazardous to draw any firm
conclusions from this comparison. However, taken at face
value this would argue that the low codon usage in PRI is
dominated by considerations of avoidance of the CG
sequence. A possible reason for this avoidance, as already
indicated, is that the CG sequence is a site for methylation
in PRI (Razin and Riggs, 1980). In YSC, it remains possible
that coding pressure influences the dinucleotide frequencies
in the coding regions.

(h) Other factors influencing the selection of low-usage 
codons 

Shepherd (1981) noticed that the coding sequences of 
most reading frames have a bias for the sequence RNY 
(R = purine; Y = pyrimidine; and N = purine or pyrimidine).
We have done extensive analysis of this and the
results (unpublished observations) support the view of
Shepherd (1981), which was based on a much smaller data
base. It is notable that of the designated low-usage codons
(Table IV) there is only one example of a RNY sequence. 
This correlates with the Shepherd view and suggests that 
the codons selected as low-usage codons in various species 
may have evolved from less popular sequence arrange- 
ments. 
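
The RNY observation can be checked directly against the codon lists. The sketch below takes the low-usage codons as read from Table IV and tests each for a purine in the first position and a pyrimidine in the third. The names are hypothetical.

```python
PURINES = set("AG")
PYRIMIDINES = set("CU")

# Low-usage codons as read from Table IV.
LOW_USAGE = {
    "ECO": ["AGG", "AGA", "AUA", "CUA", "CGA", "CGG", "CCC", "UCG"],
    "YSC": ["AGG", "CGA", "CGG", "CGC", "CCG", "CUC", "GCG", "ACG"],
    "DRO": ["AGA", "AUA", "CGA", "CGG", "UUA", "GGG", "AGU", "UGU"],
    "PRI": ["CGA", "CGG", "UCG", "CGC", "CCG", "GCG", "ACG", "CGU"],
}


def is_rny(codon):
    """True if the codon has a purine first and a pyrimidine third (RNY)."""
    return codon[0] in PURINES and codon[2] in PYRIMIDINES


rny_hits = {species: [c for c in codons if is_rny(c)]
            for species, codons in LOW_USAGE.items()}
print(rny_hits)
```

With these lists the only RNY codon is AGU in DRO, consistent with the single example mentioned above.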

(i) Conclusions 

(1) We have presented new approaches to identify low-usage
codons in a reliable fashion. (2) We have been able
to assign with reasonable confidence (with the possible
exception of PRI) up to eight of the lowest-usage codons in
several organisms (Table IV). (3) Gross base composition
and dinucleotide frequencies in general cannot explain
choices of low-usage codons; however, dinucleotide usage
does show some influence on codon usage in PRI. (4) Low-usage
codons are clearly avoided in abundant proteins;
those proteins containing a high % of low-usage codons are
generally cases where an excess of protein could be detrimental.
(5) In a subsequent paper, we shall propose a
model by which low-usage codons may affect translation
rates. Also, a more detailed review of our data on codon
usage in primates has recently been published (Zhang and
Zubay, 1991).



ACKNOWLEDGEMENTS 

We acknowledge support by the American Cancer
Society, grant No. MV-313A to G.Z., and the National
Institutes of Health, grants Nos. GM277U and 2 S07
RR05393 to R.G.; the computer resource is supported by
the National Institutes of Health, grant No. 2P41
RR00442-20.



REFERENCES 

Abate, C., Luk, D., Gentz, R., Rauscher III, F.J. and Curran, T.: Expression
  and purification of the leucine zipper and DNA-binding domains
  of Fos and Jun: both Fos and Jun contact DNA directly. Proc. Natl.
  Acad. Sci. USA 87 (1990) 1032-1036.
Alff-Steinberger, C.: Codon usage in Homo sapiens: evidence for a coding
  pattern on the noncoding strand and evolutionary implications of
  dinucleotide discrimination. J. Theor. Biol. 124 (1987) 89-95.
Andersson, S.G.E. and Kurland, C.G.: Codon preferences in free-living
  microorganisms. Microbiol. Rev. 54 (1990) 198-210.
Bibb, M.J., Findlay, P.R. and Johnson, M.W.: The relationship between
  base composition and codon usage in bacterial genes and its use for
  the simple and reliable identification of protein-coding sequences.
  Gene 30 (1984) 157-166.
Bennetzen, J.L. and Hall, B.D.: Codon selection in yeast. J. Biol. Chem.
  257 (1982) 3026-3031.
Bonekamp, F., Andersen, H.D., Christensen, T. and Jensen, K.F.:
  Codon-defined ribosomal pausing in Escherichia coli detected by using
  the pyrE attenuator to probe the coupling between transcription and
  translation. Nucleic Acids Res. 13 (1985) 4113-4123.
Carter, P.W., Bartkus, J.M. and Calvo, J.M.: Transcription attenuation
  in Salmonella typhimurium: the significance of rare leucine codons in
  the leu leader. Proc. Natl. Acad. Sci. USA 83 (1986) 8127-8131.
Chen, G.-F.T. and Inouye, M.: Suppression of the negative effect of minor
  arginine codons on gene expression; preferential usage of minor
  codons within the first 25 codons of the Escherichia coli genes. Nucleic
  Acids Res. 18 (1990) 1465-1473.
Curran, J.F. and Yarus, M.: Rates of aa-tRNA selection at 29 sense
  codons in vivo. J. Mol. Biol. 209 (1989) 65-77.
Gouy, M. and Gautier, C.: Codon usage in bacteria: correlation with gene
  expressivity. Nucleic Acids Res. 10 (1982) 7055-7074.
Grantham, R., Gautier, C., Gouy, M., Jacobzone, M. and Mercier, R.:
  Codon catalog usage is a genome strategy modulated for gene
  expressivity. Nucleic Acids Res. 9 (1981) r43-r74.
Grosjean, H.J., De Henau, S. and Crothers, D.M.: On the physical basis
  for ambiguity in genetic coding interactions. Proc. Natl. Acad. Sci.
  USA 75 (1978) 610-614.
Grosjean, H. and Fiers, W.: Preferential codon usage in prokaryotic
  genes: the optimal codon-anticodon interaction energy and the selective
  codon usage in efficiently expressed genes. Gene 18 (1982)
  199-209.
Gutman, G.A. and Hatfield, G.W.: Nonrandom utilization of codon pairs
  in Escherichia coli. Proc. Natl. Acad. Sci. USA 86 (1989) 3699-3703.
Hoekema, A., Kastelein, R.A., Vasser, M. and De Boer, H.A.: Codon
  replacement in the PGK1 gene of Saccharomyces cerevisiae: experimental
  approach to study the role of biased codon usage in gene
  expression. Mol. Cell. Biol. 7 (1987) 2914-2924.

Ikemura, T.: Correlation between the abundance of Escherichia coli transfer
  RNAs and the occurrence of the respective codons in its protein
  genes: a proposal for a synonymous codon choice that is optimal for
  the E. coli translational system. J. Mol. Biol. 151 (1981) 389-409.
Ikemura, T.: Correlation between the abundance of yeast transfer RNAs
  and the occurrence of the respective codons in protein genes. J. Mol.
  Biol. 158 (1982) 573-597.
Ikemura, T.: Codon usage and tRNA content in unicellular and multicellular
  organisms. Mol. Biol. Evol. 2 (1985) 13-34.
Ingraham, J.L., Maaløe, O. and Neidhardt, F.C.: Growth rate as a
  variable. In: Growth of the Bacterial Cell. Sinauer Associates,
  Sunderland, MA, 1983, pp. 267-315.
Konigsberg, W. and Godson, G.N.: Evidence for use of rare codons in
  the dnaG gene and other regulatory genes of Escherichia coli. Proc.
  Natl. Acad. Sci. USA 80 (1983) 687-691.
Kurland, C.G.: Strategies for efficiency and accuracy in gene expression.
  1. The major codon preference: a growth optimization strategy.
  Trends Biochem. Sci. 12 (1987) 126-128.
Murao, K., Tanabe, T., Ishii, F., Namiki, M. and Nishimura, S.: Primary
  sequence of arginine transfer RNA from Escherichia coli. Biochem.
  Biophys. Res. Commun. 47 (1972) 1332-1337.
Nussinov, R.: Eukaryotic dinucleotide preference rules and their implications
  for degenerate codon usage. J. Mol. Biol. 149 (1981) 125-131.
Osawa, S. and Jukes, T.H.: Evolution of the genetic code as affected by
  anticodon content. Trends Genet. 4 (1988) 191-198.
Pongs, O., Kecskemethy, N., Mueller, R., Krah-Jentgens, I., Baumann,
  A., Kiltz, H.H., Canal, I., Llamazares, S. and Ferrus, A.: Shaker
  encodes a family of putative potassium channel proteins in the
  nervous system of Drosophila. EMBO J. 7 (1988) 1087-1096.
Post, L.E., Strycharz, G.D., Nomura, M., Lewis, H. and Dennis, P.P.:
  Nucleotide sequence of the ribosomal protein gene cluster adjacent
  to the gene for RNA polymerase subunit β in E. coli. Proc. Natl.
  Acad. Sci. USA 76 (1979) 1697-1701.
Razin, A. and Riggs, A.D.: DNA methylation and gene function. Science
  210 (1980) 604-610.
Robinson, M., Lilley, R., Little, S., Emtage, J.S., Yarranton, G., Stephens, P.,
  Millican, A., Eaton, M. and Humphreys, G.: Codon usage can affect
  efficiency of translation of genes in Escherichia coli. Nucleic Acids Res.
  12 (1984) 6663-6671.


Rojiani, M.V., Jakubowski, H. and Goldman, E.: Relationship between
protein synthesis and concentrations of charged and uncharged
tRNA-Trp in Escherichia coli. Proc. Natl. Acad. Sci. USA 87 (1990)
1511-1515.

Sharp, P.M. and Li, W.-H.: Codon usage in regulatory genes in
Escherichia coli does not reflect selection for 'rare' codons. Nucleic
Acids Res. 14 (1986) 7737-7749.

Sharp, P.M. and Li, W.-H.: The codon adaptation index - a measure of
directional synonymous codon usage bias, and its potential
applications. Nucleic Acids Res. 15 (1987) 1281-1295.

Sharp, P.M., Cowe, E., Higgins, D.G., Shields, D.C., Wolfe, K.H. and
Wright, F.: Codon usage patterns in Escherichia coli, Bacillus subtilis,
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Drosophila
melanogaster and Homo sapiens: a review of the considerable
within-species diversity. Nucleic Acids Res. 16 (1988) 8207-8211.

Shepherd, J.C.W.: Method to determine the reading frame of a protein
from the purine/pyrimidine genome sequence and its possible
evolutionary justification. Proc. Natl. Acad. Sci. USA 78 (1981)
1596-1600.

Sørensen, M.A., Kurland, C.G. and Pedersen, S.: Codon usage
determines translation rate in Escherichia coli. J. Mol. Biol. 207 (1989)
365-377.

Sprinzl, M., Hartmann, T., Meissner, F., Moll, J. and Vorderwülbecke,
T.: Compilation of tRNA sequences and sequences of tRNA genes.
Nucleic Acids Res. 15 (1987) r53-r188.

Wada, K.-N., Aota, S.-I., Tsuchiya, R., Ishibashi, F., Gojobori, T. and
Ikemura, T.: Codon usage tabulated from the GenBank genetic
sequence data. Nucleic Acids Res. 18, Suppl. (1990) 2367-2411.

Xu, H. and Boeke, J.D.: Host genes that influence transposition in yeast:
the abundance of a rare tRNA regulates Ty1 transposition frequency.
Proc. Natl. Acad. Sci. USA 87 (1990) 8360-8364.

Yarus, M., Cline, S.W., Wier, P., Breeden, L. and Thompson, R.C.:
Actions of the anticodon arm in translation on the phenotypes of
RNA mutants. J. Mol. Biol. 192 (1986) 235-255.

Zhang, S. and Zubay, G.: The peculiar nature of codon usage in primates.
In: Setlow, J.K. (Ed.), Genetic Engineering, Vol. 13, 1991, pp. 73-113.




EXHIBIT C

FEBS 15257

FEBS Letters 362 (1995) 1-4



Minireview 

Differential codon usage: a safeguard against inappropriate expression
of specialized genes?

Milton H. Saier, Jr.*

Department of Biology, University of California at San Diego, La Jolla, CA 92093-0116, USA
Received 2 January 1995; revised version received 14 February 1995



Abstract Recent work has suggested that rare codons are sometimes
used for the regulation of specialized gene expression in
bacteria. Moreover, the cellular levels of certain tRNAs may
fluctuate with growth conditions. Evidence implicating such
mechanisms in the control of photosynthesis in Rhodobacter,
solventogenesis in Clostridium, sporulation in Streptomyces and
fimbrial phase variation in E. coli is summarized. It is suggested
that such mechanisms will prove applicable to the control of
numerous additional specialized functions, and that the empirical
tools for testing this possibility are currently available.

Key words: Codon usage; Bacterium; Translation; Gene
expression; Sporulation; Photosynthesis; Solventogenesis



1. Introduction

In 1989, Brinkman et al. noted that eukaryotic proteins such
as the human tissue type plasminogen activator, prourokinase
and the gp41 protein of HIV, which have a high content of rare
codons in their respective genes, are poorly expressed in E. coli
[1]. Moreover, induction of the expression of any one of these
heterologous, plasmid-encoded genes was found to inhibit cell
division and cause plasmid instability. Most remarkably, when
the bacteria were simultaneously provided with a plasmid bearing
the dnaY gene, encoding a rare tRNA (tRNA-Arg(AGG)),
production of the eukaryotic proteins was increased while plasmid
stability and cell viability improved [1].

While these observations were of considerable practical
significance to the bioengineer, they foreshadowed observations
and experiments that would suggest that the use of rare codons
for specialized or differentiation-specific functions in bacteria
might provide a general mechanism to ensure proper temporal
and spatial expression of the encoding genes. Although this
hypothesis is still far from established, work in several
laboratories has provided indirect evidence suggesting that rare codon
usage is of functional significance in restricting or specifying
appropriate gene expression. In this minireview I summarize
the evidence concerned with this issue and reiterate the
suggestion that the complement of tRNAs found in a particular
bacterium under one set of growth conditions may differ from that
found under another set of growth conditions.
2. Codon usage and gene expression

All living organisms possess characteristic GC contents and
preferred sets of codons used for protein biosynthesis. GC
content is a major determinant of codon usage, and codon
adaptation indices (CAI values) have proven to provide a reliable,
empirically determined estimate of gene expression level
for specific groups of organisms [2-4]. tRNA availability during
[...] determining the characteristic preferred codon usage. However,
genes obtained by horizontal transmission from a [...]
different from those of the recipient bacterium approach the
[...] host only after
hundreds of millions of years [5]. This fact suggests, first, that
differences in codon usage must have arisen relatively early
during prokaryotic evolution, and, second, that the pressure for
a newly acquired gene to assume the codon usage of the host
[...] nevertheless be
expressed at high levels when cloned behind a strong promoter
(see, for example, [6]) has led some investigators to suggest that
[...] function [4]. It should be pointed out in this regard that the
inability to demonstrate a regulatory effect with one set of genes
expressed under a given set of experimental conditions does not
rule out the possibility of an analogous regulatory function for
another set of genes expressed preferentially under a different
set of conditions. Below I summarize evidence suggesting that
[...] may be regulated at the translational level by selective use of
rare codons in relevant structural genes (see Table 1).
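
The codon usage comparisons discussed in this minireview (Table 2 reports fractional codon usage for each amino acid) rest on a simple count. The following is a minimal sketch of that count, not the method used by any of the cited authors. The function name, the partial codon table and the toy sequences are illustrative assumptions.

```python
from collections import Counter, defaultdict

# Partial excerpt of the standard genetic code (RNA alphabet). Only the four
# amino acids that appear in Table 2 are included; this is an illustrative
# assumption, and a real analysis would use the full 64-codon table.
CODON_TO_AA = {
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "UUA": "Leu", "UUG": "Leu", "CUU": "Leu",
    "CUC": "Leu", "CUA": "Leu", "CUG": "Leu",
    "AAU": "Asn", "AAC": "Asn",
    "UGU": "Cys", "UGC": "Cys",
}


def fractional_codon_usage(coding_sequences):
    """Return {codon: fraction of its amino acid's codons} over all sequences.

    Each sequence is assumed to be in frame, starting at its first base.
    """
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper().replace("T", "U")  # accept DNA or RNA input
        usable = len(seq) - len(seq) % 3     # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TO_AA:
                counts[codon] += 1

    # Total number of codons observed for each amino acid.
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TO_AA[codon]] += n

    return {codon: n / totals[CODON_TO_AA[codon]] for codon, n in counts.items()}


# Hypothetical toy sequences, not real genes.
usage = fractional_codon_usage(["GCUGCCGCCGCC", "AAUAACAAC"])
print(usage)
```

Running the same function separately on two functional groups of genes gives the per-group fractions that a table such as Table 2 compares.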

In [...], Wu and Saier noted that the genes encoding proteins
of the photosynthetic apparatuses (reaction center and light
harvesting proteins) of the Gram-negative purple bacteria,
Rhodobacter capsulatus and R. sphaeroides, differed in codon
usage from that of genes encoding enzymes of the fructose
utilization pathway [7]. While most codons occurred with
similar frequencies in these two groups of genes, a few were found
to predominate, or be present exclusively, in one or the other
(see Table 2 for representative examples). Moreover,
[...] utilization or
carotenoid biosynthesis, that were expressed under both
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Table 1 

Selective use of rare codons postulated to control specialized functions
in bacteria

Function                      Organism                     Codon   Amino acid

Photosynthesis                Rhodobacter capsulatus       GCU     Ala
                                                           CUC     Leu
Fructose utilization          Rhodobacter capsulatus       AAU     Asn
                                                           UGU     Cys
Solventogenesis               Clostridium acetobutylicum   ACG     Thr
Aerial mycelium development   Streptomyces coelicolor      UUA     Leu
Fimbrial production           Escherichia coli             UUG     Leu


heterotrophic and phototrophic conditions, exhibited rare
codon usage frequencies that were intermediate between those
found in the photosynthetic and fructose-catabolic genes
(Table 2). These differences were shown to be statistically
significant. It was suggested that different tRNA pools were
present under phototrophic vs. heterotrophic growth conditions,
and that growth conditions might influence the relative rates
of transcription of the tRNA genes and their cognate aminoacyl
tRNA synthetases. Differences in codon usage might
generally allow operation of novel post-transcriptional regulatory
mechanisms. It seemed 'reasonable to suppose that charged
tRNA availability and codon usage could provide a safeguard
against expression of specialized genes under inappropriate
conditions' [7].
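
The text states that the differences in codon usage were statistically significant but does not say which test was applied. As one possible illustration, assuming a plain Pearson chi-square test on a 2x2 table (one codon versus its synonymous alternatives, in two gene groups), the statistic could be computed as sketched below. The function name and the counts are hypothetical and are not taken from reference [7].

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table (1 degree of freedom).

    a: occurrences of the codon of interest in gene group 1
    b: occurrences of its synonymous codons in gene group 1
    c, d: the same two counts for gene group 2
    No continuity correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Critical value of the chi-square distribution, 1 degree of freedom, p = 0.05.
CRITICAL_5_PERCENT_DF1 = 3.841

# Hypothetical counts: 22 of 50 Asn codons are AAU in one gene group,
# 3 of 60 in the other.
stat = chi_square_2x2(22, 28, 3, 57)
print(stat, stat > CRITICAL_5_PERCENT_DF1)
```

With small counts an exact test would be more appropriate; the chi-square form is used here only because it can be written without external libraries.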

4. Rare codon usage as a potential regulator of solventogenesis
in Clostridium

Sauer and Dürre noted in 1992 that a mutational defect
preceding the gene thrA encoding a rare tRNA, tRNA-Thr(ACG), in
the low GC Gram-positive bacterium, Clostridium acetobutylicum,
gave rise to the absence of solventogenesis [8]. This strict
anaerobe is a spore-forming bacterium that produces acetone
and butanol only during a late stage in the growth cycle. The
shift to solventogenesis is accompanied by a series of
morphological and physiological changes in motility, shape, and
granulose content, culminating in endospore formation. Sauer and
Dürre noted that the ACG codon is rarely used and is largely
restricted to genes either expressed at the end of exponential
growth or involved in the inducible uptake or metabolism of
minor carbon and nitrogen sources [8]. Because these
investigators did not conduct statistical analyses, it was not possible to
state that the observed differences in codon usage reflected a
unique characteristic of specific groups of genes encoding
specialized functions rather than depressed levels of expressivity
[4]. Nevertheless, the potential implications of the observations
were clear. As in the case of phototrophic vs. heterotrophic
gene expression in Rhodobacter, codon usage in Clostridium
may provide a safeguard to ensure proper expression of certain
stationary phase vs. log phase genes.

5. Codon usage as a determinant of the differentiated state in
Streptomyces

Species of the high GC Gram-positive genus Streptomyces
undergo fungal-like differentiation with the sequential
formation of vegetative and aerial mycelia [9,10]. The fact that only
the latter structures contain spores reflects the spatial and
temporal constraints imposed upon the process of terminal
differentiation within this genus. The industrial importance of these
organisms is related to their capacity to produce an array of
antibiotics and useful secondary metabolites during the
post-exponential growth phase. Although these strict aerobes have
many of the enzymatic attributes of their low GC Gram-positive
cousins, their regulatory mechanisms appear to be
remarkably different [11-13].

Leskiw et al. [14] and Fernandez-Moreno et al. [15] first
observed that, in Streptomyces coelicolor, a genetic defect in the
gene bldA, encoding a rare tRNA, tRNA-Leu(UUA) [16,17], blocked
aerial mycelium formation and prevented efficient phenotypic
expression of several genes containing the rare UUA codon.
bldA mutations (including deletions) did not interfere with
vegetative growth but did prevent aerial mycelium formation and
antibiotic production (see [18] for a review). It was suggested
that this rare codon occurred preferentially in genes concerned
with differentiation and antibiotic production as contrasted
with those required for vegetative growth.

More recently, evidence was presented suggesting that
mature tRNA-Leu(UUA) accumulates in ever increasing amounts as
S. coelicolor cultures age, and that the temporally regulated
accumulation of this mature tRNA species correlates with an
increase in efficiency of UUA-containing messenger RNA
transcription and/or translation ([19]; but see also [20]). It seemed
to exert regulatory effects on events occurring during late
growth, including morphological differentiation and antibiotic
production.

6. Rare codon usage and the control of fimbrial production in
E. coli

A recently noted example of potential rare codon control of 






Table 2

Examples of differential codon usage in photosynthetic versus heterotrophic genes in Rhodobacter

Codon   Amino acid   Fractional codon usage for each amino acid
                     fru     pho     nil     Total

Elevated codon usage in pho genes
GCU     Ala          0.02    0.11    0.03    0.05
CUC     Leu          0.11    0.35    0.16    0.19

Elevated codon usage in fru genes
AAU     Asn          0.44    0.05    0.16    0.18
UGU     Cys          0.25    0.00    0.13    0.12

pho, photosynthetic genes, total of 1976 codons analyzed for [...];
[...] total of [...] codons analyzed for [...]. Total, [...]
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specialized gene expression concerns the production of type 1 
fimbriae in the Gram-negative enteric bacterium, Escherichia 
coli strain F18, which is able to colonize the mouse colon [21].
Burghoff et al. [22] isolated a 6.5 kb E. coli sequence that
enhanced the colonizing ability of strain F18 and
simultaneously stimulated synthesis of type 1 fimbriae. The gene
responsible for this stimulation proved to be the leuX gene, encoding
a tRNA specific for the rare leucine codon UUG. This gene is
in single copy at 97 min on the E. coli chromosome, and the
encoded tRNA species (LeuX) is apparently dispensable for
growth [23]. No effect on growth rate was observed when leuX
was mutated [24]. Another tRNA-Leu, LeuZ, specific for the
UUA leucine codon, presumably recognizes UUG by 'wobble',
and can thereby substitute adequately for LeuX, at least with
respect to the expression of genes encoding functions required
for vegetative growth.

The mechanism by which leuX gene expression influences
type 1 fimbrial production is probably complex. The fimA gene,
encoding the principal type 1 fimbrillin, lacks UUG codons
altogether [25]. However, synthesis of type 1 fimbriae is subject
to phase variation due to inversion of a 314 bp DNA segment
that includes the fimA promoter ([26]; but see also [27]). The
ratio of the products of two fim genes, fimB and fimE,
determines the frequencies of inversion in the two opposing directions,
with high levels of FimB favoring the 'off' to 'on' transition.
Since fimB has six UUG codons while fimE has only two [28],
it has been proposed that LeuX influences type 1 fimbrial
production by controlling fimB expression more stringently than
that of fimE [29]. In this regard it is interesting to note that leuX
expression is apparently regulated by two proteins (of 22 and
26 kDa) encoded by genes adjacent to leuX. Deletion analyses
have suggested that the 22 kDa protein is a transcriptional
activator while the 26 kDa protein is a repressor of leuX
expression. These proteins may therefore be indirect regulators of
type 1 fimbrial phase variation, and consequently of net
fimbrial production.

Various E. coli strains are collectively capable of producing
at least six distinct virulence-related fimbriae, each exhibiting
specificity for and mediating adhesion to a specific mammalian
cell surface macromolecule [30,31]. Expression of these
fimbriae is often subject to phase variation in agreement with the
belief that successful colonization of the host depends on the
timely expression and subsequent silencing of specific
virulence-related genes, depending on the stage of infection. A
recent analysis has revealed that the leuX gene of uropathogenic
E. coli strain 536 encompasses one of several sites responsible
for genetic instability [32]. Internal to leuX is one of two
18-nucleotide direct repeats that serve as functional sites for
excision of a 190 kb DNA segment. This segment encodes, among
other functions, P-related fimbriae. Excision of this DNA
segment silences expression of leuX (possibly controlling type 1
fimbrial synthesis, as noted above) as well as expression of the
genetic apparatus encoding P-related fimbriae. As bacterial
cells lacking 'excess DNA baggage' and incapable of making
fimbriae divide with increased growth rates, it may be that
timely excision provides the bacterium that has already
established itself in the host organism with pathogenic advantage
[32]. Based on the proposed regulatory role of rare tRNAs in
controlling fimbrial production, we suggest that it was not
accidental that tRNA loci have come to serve as sites of
virulence-associated DNA insertion/deletion phenomena.
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7. Conclusions and perspectives 



How important are the postulated regulatory mechanisms
giving rise to codon-controlled phenotypic gene expression?
Are they generally operative for the control of
starvation-induced or stress-related vs. vegetative gene expression in E. coli
and other bacteria [33]? Do they function to safeguard proper
temporal expression of sporulation (spo)-specific genes at any
one stage or during several different stages in the well-defined
programs of differentiation of various Bacillus species [34]? Do
they play a role in the control of growth phase-specific or
condition-selective gene expression, e.g. expression of genes
concerned with bioluminescence in Vibrio species [35,36],
bacteriorhodopsin-mediated photosynthesis in archaebacteria
[37], or induction of virulence-specific genes in bacterial
pathogens of plants, animals and other bacteria [38-41]?

The first step towards answering these important questions
would seem to be to analyze functionally related groups of
bacterial genes for statistically significant differences in codon
usage, as reported by Wu and Saier [7] for the photosynthetic
vs. heterotrophic genes of Rhodobacter. A second step would
be to measure variations in the cellular concentrations of
specific rare tRNA species made under relevant but differing
physiological conditions. The third step would be to establish a
causal relationship between rare codon occurrence/tRNA level
and gene expressivity. Such studies may lead to recognition of
novel codon usage-mediated mechanisms for ensuring the
proper expression of temporally and spatially regulated genes
in prokaryotic microorganisms. The relevance of such
mechanisms to eukaryotic organisms, including protozoa, fungi,
plants and animals, could then be ascertained by the
application of straightforward comparative approaches.
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INTRODUCTION 

The new compilation of tRNA Sequences and Sequences of tRNA genes contains in addition to 3279 
sequences of the last edition from 1998 (1) the completely new Genomic tRNA Compilation including 
the sequences of tRNA genes from complete genomes published up to January 2002. The current 
Database consists of three parts: 

1. Compilation of tRNA Genes (MS Excel® file, ZIPed)

2. Compilation of tRNA Sequences (MS Excel® file, ZIPed)

3. Genomic tRNA Compilation (MS Excel® file, ZIPed)



Compilation of tRNA Genes, 

is a summary of the sequences of tRNA genes published in the literature and databases up to the end of
1998. It contains tRNA genes of all organisms and organelles, but has not been updated since January 1999. This
table contains about 500 sequences of cytoplasmic tRNA genes that are not included in the Genomic
tRNA Database. Most of the tRNA gene entries in this table have references to the publications in which
the sequence was communicated.
is a summary of tRNA sequences, including modified bases and references to the publications. The
references are restricted to the first publication of the complete sequence unless additional information
([...] corrections, etc.) was later obtained. In such cases additional references were
added. This compilation is updated up to January 2002. The table contains the known tRNA sequences
of all organisms including organelles. This is the continuation of the original tRNA compilation first
published in 1978.

Genomic tRNA Compilation, 

is a new addition to the Database. This is the most complete compilation of the sequences of
cytoplasmic tRNA genes derived from complete genome sequences included in DNA databases. Since
sequences of tRNA genes originating from cellular organelles frequently cannot be fitted to the
general cloverleaf scheme, they were not included in the Genomic tRNA Compilation. There are
specialised databases dealing with these sequences (see links below).

The current Genomic tRNA Compilation consists of about 3700 tRNA gene sequences from 63 organisms
covering archaea, bacteria, higher and lower eukarya. The database includes the tRNA gene sequences
collected in GtRDB (2) as well as those from the additional complete genomes found in DNA databases.
If the genomes of different strains of the same organism were sequenced, the corresponding tRNA
genes were added to the database independently.



PRESENTATION OF SEQUENCES 

Compilation of tRNA Genes and Compilation of tRNA Sequences 

In order to facilitate computer analysis, an alignment of sequences is used which is most compatible
with the tRNA phylogeny and known three-dimensional structures of tRNA (3, 4). The corresponding
numbering system is shown in the Figure. Positions in a particular sequence which are not filled (gaps in the
generalised structure) are indicated by a dash. All nucleotide insertions are commented and denoted by
underlining at the place of insertion.

These compilations use a one-letter code for all nucleotides including modified ones. For the standard
nucleotides adenosine, cytidine, guanosine, thymidine and uridine, the usual abbreviations A, C, G, T
and U, respectively, are used. To designate modified nucleotides, the other ASCII signs are employed
(see table "Intro" in the corresponding MS Excel® file). Terminology and structure of the modified
nucleosides occurring in tRNAs were used according to (5) and (6).

Sequences are presented as MS Excel® files. Each sequence in the compilation occupies two
consecutive rows. The first row begins with the unique six-position identification code of the sequence
(D or R for DNA or RNA, respectively; a one-letter code for the amino acid, X for methionine-initiator,
Z for selenocysteine; the three-digit code specifying the organism and one digit for isoacceptor
number). After this, the sequence of the anticodon is given, followed by the abbreviated name and the
kingdom of the organism, and the sequence (99 standard positions). The second line begins with the sign [...]
and contains the information about base-pairing (double helical regions only; tertiary interactions are not
annotated). Nucleotides involved in Watson-Crick pairs are marked with "=", the GU pairs are indicated
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The database is organised as an MS Excel® workbook. All the information collected are split into 
different indexed tables according to the type of data (specificity, sequence, organism, etc.) and the 
descriptions of certain genes are summarised in the main worksheet that includes the relations between
the data tables. The information can be obtained by filling in the query form, which allows one to enter simple
search criteria and to select the type of data to be displayed. The result of the search is presented as a table
containing the description of the genes found. This includes unique id, amino acid specificity, anticodon
sequence, organism name and taxonomy, strain, original database source, position of the gene in the
genome, literature reference, sequence, basepairing and additional comments. Sequences are aligned in
the same way as described above for the tRNA compilations.
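
The six-position identification code described above for the tRNA compilations can be unpacked mechanically. The sketch below assumes the layout given there: one position for D or R (DNA or RNA), one letter for the amino acid with X for methionine-initiator and Z for selenocysteine, three digits for the organism, and one digit for the isoacceptor number. The function name, the returned field names and the example code `RA1660` are hypothetical and are not taken from the database documentation.

```python
import re

# Assumed layout: [D|R][amino-acid letter][3-digit organism][1-digit isoacceptor]
ID_PATTERN = re.compile(r"([DR])([A-Z])(\d{3})(\d)")

SPECIAL_AMINO_ACID_CODES = {
    "X": "methionine-initiator",
    "Z": "selenocysteine",
}


def parse_trna_id(code):
    """Split a six-position identification code into its four fields."""
    match = ID_PATTERN.fullmatch(code.strip().upper())
    if match is None:
        raise ValueError(f"not a six-position identification code: {code!r}")
    molecule, amino_acid, organism, isoacceptor = match.groups()
    return {
        "molecule": "DNA" if molecule == "D" else "RNA",
        # Ordinary amino acids keep their one-letter code.
        "amino_acid": SPECIAL_AMINO_ACID_CODES.get(amino_acid, amino_acid),
        "organism_code": organism,
        "isoacceptor": int(isoacceptor),
    }


print(parse_trna_id("RA1660"))  # hypothetical example code
```

Keeping the organism code as a string preserves any leading zeros, which a lookup against the organism table would need.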

In addition to the plain text table, one can explore the result of the search by presenting the sequences in
cloverleaf form (see Figure). It is possible to scroll through the sequences found one by one or to select directly the
sequence of interest from the result table. The presentation supports a colour code for different structural
features in the canonical cloverleaf model.

Simple statistical information on the occurrence of certain bases at given positions and the preferences
in basepairing can also be obtained on a special data sheet.



Useful links: 

The RNA Modification Database
http://medlib.med.utah.edu/RNAmods

A database for plant mitochondrial tRNA genes and molecules
http://bio-www.ba.cnr.it:8000/BioWWW/#PLMItRNA

Compilation of mammalian mitochondrial tRNA genes
http://mamit-trna.u-strasbg.fr

GtRDB: The Genomic tRNA Database
http://rna.wustl.edu/tRNAdb
tRNA database search engine

Internet service that allows one to find records in the database according to multiple search criteria.
Complicated sequence-based queries can be formed (Updated for the data in Compilation of tRNA
Genes and Compilation of tRNA Sequences up to the end of 1998).



tRNA-Editor 

Researchers who wish to perform an advanced search for tRNA sequences according to several criteria,
e.g. anticodon, amino acid specificity, modified nucleoside, or wish to print the requested sequences in
the cloverleaf form can download appropriate Windows 3.1 based software as a 900kB ZIPed file
(Updated for the data in Compilation of tRNA Genes and Compilation of tRNA Sequences up to the end 
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ABSTRACT 

Translation of the yeast retrotransposon Ty1 TYA1(gag)-TYB1(pol) gene occurs by a +1 ribosomal
frameshifting event at the sequence CUU AGG C. Because overexpression of a low abundance tRNA-
Arg(CCU) encoded by the HSX1 gene resulted in a reduction in Ty1 frameshifting, it was suggested
that a translational pause at the AGG-Arg codon is required for optimum frameshifting. The present
work shows that the absence of tRNA-Arg(CCU) affects Ty1 transposition, translational frameshifting,
and accumulation of mature TYB1 proteins. Transposition of genetically tagged Ty1 elements
decreases at least 50-fold and translational frameshifting increases 3-17-fold in cells lacking tRNA-
Arg(CCU). Accumulation of Ty1-integrase and Ty1-reverse transcriptase/ribonuclease H is defective
in an hsx1 mutant. The defect in Ty1 transposition is complemented by the wild-type HSX1 gene or
a mutant tRNA-Arg(UCU) gene containing a C for T substitution in the first position of the anticodon.
Overexpression of TYA1 stimulates Ty1 transposition 50-fold above wild-type levels when the level of
transposition is compared in isogenic hsx1 and HSX1 strains. Thus, the HSX1 gene determines the
ratio of the TYA1 to TYA1-TYB1 precursors required for protein processing or stability, and keeps
expression of TYB1 a rate-limiting step in the retrotransposition cycle.



THE Saccharomyces cerevisiae retrotransposon Ty1
is a mobile genetic element that replicates via
an RNA intermediate (reviewed by Boeke and Sandmeyer
1991; Garfinkel 1992). The transposition
cycle of Ty1 elements resembles several important
steps in the replication of retroviruses. Ty1 protein
maturation by Ty1-protease (PR) and reverse transcription
take place within Ty1 virus-like particles
(Ty1-VLPs), which appear to be absolutely required
for the transposition process. The Ty1 genome contains
two genes, TYA1 and TYB1, which correspond to
the gag and pol genes of retroviruses, respectively
(Clare and Farabaugh 1985). As with certain retroviral
pol genes (reviewed by Hatfield et al. 1992),
expression of TYB1 requires programmed ribosomal
frameshifting (Clare, Belcourt and Farabaugh
1988). Ribosomal frameshifting solves two problems
encountered in the life cycle of a retrovirus or
retrotransposon. First, since catalytic Pol proteins, such as
reverse transcriptase/ribonuclease H (RT/RH) and
integrase (IN), are usually found in much lower
amounts than the structural Gag proteins, requiring
a frameshift event for pol expression is an effective
strategy of gene regulation. Second, since Pol proteins
function within a particle, creating a Gag-Pol fusion
protein by frameshifting delivers Pol proteins to the
correct compartment.

Genetics 135: 309-320 (October, 1993)

The TYA1-TYB1 fusion protein is synthesized by
a +1 frameshifting event in the TYA1 sequence CUU
AGG C (Belcourt and Farabaugh 1990). Ribosomal
pausing at a rare AGG-arginine codon and
slippage of a leucyl-tRNA from CUU to UUA are
required for frameshifting. A single-copy tRNA-
Arg(CCU) gene that recognizes the AGG codon is
located on chromosome X (Gafner, De Robertis and
Philippsen 1983). Belcourt and Farabaugh (1990)
have shown that overexpression of the tRNA-
Arg(CCU) gene reduces Ty1 frameshifting. Ty1
transposition is also reduced when the level of the
tRNA-Arg(CCU) is increased (Xu and Boeke 1990).
These results suggest that the low abundance of
tRNA-Arg(CCU) promotes frameshifting. Recently,
we have identified this tRNA gene as the HSX1 gene
involved in the heat shock response (Kawakami et al.
1992). Even though there is only one copy of the
HSX1 gene (Gafner, De Robertis and Philippsen
1983), an hsx1 disruption mutant is viable. Apparently,
the AGG codons normally decoded by the
single-copy HSX1 gene are decoded by another tRNA
[probably by the near-cognate tRNA-Arg(UCU)
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TABLE 1 
Yeast strains 



Strain      Genotype                                                             Plasmid                    Source or reference

DMy51       MATa ura3-167 his3Δ200 leu2Δ trp1Δ1 GAL                              pGTy1A-Bneo (pD109)        This work
DMy94       MATα ura3-52 his3Δ200 lys2 trp1-289 GAL                                                         This work
JC287       MATa ura3-167 his3Δ200 leu2GB Ty1-146::lacZ Ty1mhis3AI-263 GAL                                  M.J. Curcio
JC344       MATa ura3-167 his3Δ200 leu2GB Ty1-146::lacZ Ty1mhis3AI-270 GAL                                  M.J. Curcio
KK156       JC287; hsx1::LEU2                                                                               This work
KK157       JC344; hsx1::LEU2                                                                               This work
KK240       MATa ura3 his3 leu2 trp1 hsx1::HIS3                                                             This work
KK242       MATα ura3 his3 leu2 trp1                                                                        This work
KD198-16A   MATα his4Δ5 ura3 arg11 GAL                                                                      K.J. Durbin
DG1301      JC344                                                                pGAL1-lacZ                 This work
DG1302      JC344                                                                pGTy1-H3neo                This work
DG1305      KK157                                                                pGAL1-lacZ                 This work
DG1306      KK157                                                                pGTy1-H3neo                This work
DG1333      JC344                                                                pGTy1-H3neo::SacI-1702     This work
DG1334      KK157                                                                pGTy1-H3neo::SacI-1702     This work
DG1344      JC344                                                                pGTyA1neo (PGK1 ter.)      This work
DG1347      KK157                                                                pGTyA1neo (PGK1 ter.)      This work



gene]. In this paper, we describe the effects of an hsx1
disruption mutant on Ty1 frameshifting, transposition
and protein processing.
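
The +1 frameshift at CUU AGG C described above can be made concrete with a short sketch. It only models the change of reading frame: codons are read in the zero frame up to and including CUU, one nucleotide is skipped (the leucyl-tRNA slips from CUU to UUA), and reading resumes at GGC. The function names and the toy mRNA are hypothetical, and nothing here models pausing or tRNA abundance.

```python
def read_codons(mrna, start=0):
    """Split mrna into complete codons, beginning at index start."""
    return [mrna[i:i + 3] for i in range(start, len(mrna) - 2, 3)]


def plus_one_frameshift(mrna, site="CUUAGGC"):
    """Return (codons read before the shift, codons read after the shift).

    The first in-frame occurrence of the heptamer is used. Reading stays in
    the zero frame through CUU, then skips one nucleotide and continues in
    the +1 frame, so the codon after CUU is GGC instead of AGG.
    """
    pos = mrna.find(site)
    while pos != -1 and pos % 3 != 0:
        pos = mrna.find(site, pos + 1)
    if pos == -1:
        raise ValueError("no in-frame frameshift site found")
    upstream = read_codons(mrna[:pos + 3])   # zero frame, ends with CUU
    downstream = read_codons(mrna, pos + 4)  # +1 frame, starts with GGC
    return upstream, downstream


toy_mrna = "AUGCUUAGGCAUGAA"              # hypothetical sequence
print(read_codons(toy_mrna))              # zero-frame reading
print(plus_one_frameshift(toy_mrna))      # reading with the +1 shift
```

In the zero frame the toy sequence reads CUU AGG, so the ribosome meets the rare AGG codon; after the shift the same stretch reads CUU GGC, which is why the downstream frame (TYB1 in the paper) is only reached by frameshifting.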

MATERIALS AND METHODS 

Yeast strains, plasmids, general genetic methods and
media: The strains used for the Ty1 transposition assays are
listed in Table 1. Strains KK240 (MATa ura3 his3 leu2 trp1
hsx1::HIS3) and KK242 (MATa ura3 his3 leu2 trp1) were
used to test Ty1 frameshifting. These strains were derived
from an hsx1::HIS3/HSX1 diploid strain (Kawakami et al.
1992).

The plasmids pMB38-9merWT and pMB38-9merFusion
contain the frameshift heptamer fused to the Escherichia coli
lacZ gene in the +1 TYB1 reading frame and the 0 TYA1
reading frame, respectively (Belcourt and Farabaugh
1990). The plasmid pMB38-9merFusion(w/o AGG) contains
the AGG-less 0 reading frame [GAT CCG CTG ACA
CTT GGC CAT GAG GTA C (the frameshift region is
highlighted)] fused to lacZ. The plasmid pKK67 was
constructed by cloning the 230 base-pair (bp) wild-type HSX1
DNA, amplified by polymerase chain reaction (PCR) (Saiki
et al. 1985), into the URA3-based centromere-plasmid YCp50
(Rose et al. 1987). The plasmid pKK68 carrying the mutant
hsx1(MluI*) gene was constructed by digestion of the
plasmid pKK67 with MluI, fill-in synthesis with Klenow DNA
polymerase, and ligation to a SalI linker. The hsx1::HIS3
and hsx1::LEU2 disruption alleles were constructed by
modifying the same MluI restriction site and ligation to a ClaI
fragment containing the HIS3 gene, or an MluI-ClaI
fragment containing the LEU2 gene (kindly provided by P.
Rogan). The plasmid pKK69 was constructed by cloning
the PCR-amplified 112-bp wild-type SUP201-0 gene (Tmreos,
Penn and Greer 1984; Morishita and Uno 1991)
into the URA3-based centromere-plasmid pRS316 (Sikorski
and Hieter 1989). The plasmid pKK71 carrying the
SUP201-0-1(CCU) gene was constructed by digestion of
plasmid pKK69 with MluI and BamHI and ligation to a 63-bp
synthetic double-stranded DNA containing the C for T
substitution at the 3' base of the anticodon. The EcoRI-BamHI
DNA fragments containing the mutant and wild-type tRNA
genes were prepared from plasmids pKK67, pKK68, pKK69
and pKK71, and subcloned into the TRP1-based
centromere-plasmid pRS314 (Sikorski and Hieter 1989). These
subcloning procedures generated plasmids pKK73 (derived
from plasmid pKK67), pKK74 (from pKK68), pKK75 (from
pKK69), and pKK76 (from pKK71). The plasmid pGTy1A-Bneo
(also known as plasmid pD109), with the Ty1 frameshift
correctly removed (Belcourt and Farabaugh 1990),
was constructed from a transposition-competent pGTy1-
H3/Ty1-912 hybrid plasmid by oligonucleotide-bridge
mutagenesis (Mandecki 1986). The frameshift mutation and
tRNA sequences were confirmed by chain-terminating DNA
sequencing (Sanger, Nicklen and Coulson 1977) using
Sequenase 2.0 (U.S. Biochemical Corp.). The plasmid
pGTyA1neo (PGK ter.), kindly provided by P. Rogan, was
constructed by replacing almost all of the pGTy1-H3 TYB1
gene (from a BglII site located at position 1702 to the end
of the element) with the bacterial neo gene and the PGK1
transcriptional terminator. Standard techniques were used
for all molecular cloning procedures (Sambrook, Fritsch
and Maniatis 1989).

The hsx1::HIS3 and hsx1::LEU2 disruption mutants were
constructed by single-step gene disruption (Rothstein
1991). Plasmids were introduced into cells using the
transformation procedure of Ito et al. (1983). All yeast media
and standard genetic techniques were those described by
Rose, Winston and Hieter (1990).

Transposition assays: Ty1mhis3AI and Ty1made2AI transposition
assays were performed as described previously (Curcio and
Garfinkel 1991, 1992), and are summarized briefly here. For
detecting spontaneous Ty1mhis3AI transposition events, liquid
cultures were inoculated at low densities (about 2 × 10⁵ cells/ml)
and grown to saturation at 20° in YPD or in SC-ura (glucose). A
portion of each culture was spread on SC-his or SC-his-ura
(glucose) plates and incubated at 30°. The cultures were titered on
YPD or SC-ura (glucose) plates. For detecting chromosomal
Ty1mhis3AI transposition events in the presence of a pGTy1 helper
plasmid, cells were grown on SC-ura (galactose) plates for 7 days
at 20°, or an overnight SC-ura (glucose) liquid culture was diluted
50-fold into SC-ura (galactose) liquid medium and incubated with
aeration for 3 days at 20°. Ty1mhis3AI transposition events were
detected as His⁺ papillae by replica plating cells from the SC-ura
(galactose) to SC-his-ura (glucose) plates, followed by incubation
at 30° for 3 days. To determine the number of Ty1mhis3AI or
Ty1made2AI transposition events in galactose-grown liquid cultures,
the cells were concentrated, spread on several SC-his-ura (glucose)
or SC-ade-ura (glucose) plates, and incubated at 30° for 3-5 days.
Cells were titered on SC-ura (glucose) plates. Ty1neo and Ty1A-Bneo
transposition events were detected as described previously (Boeke,
Xu and Fink 1988; Curcio, Sanders and Garfinkel 1988) with the
following minor modifications. Diploid strains were constructed by
mating strain KD198-16A with strains DG1302 or DG1306, or by mating
strains DMy51 and DMy92 (Table 1). The resulting diploids were
induced for Ty1 transposition on SC-ura (galactose) plates as
described above. After segregation of the pGTy1neo plasmid from the
strains, the level of resistance to the antibiotic G418 (Gibco) was
determined by growth on YPD plates containing a final G418
concentration of 500 µg per ml (for diploids derived from mating
strain KD198-16A with strains DG1302 or DG1306) or 75 µg per ml
(for diploids derived from mating strain DMy51 with DMy94).

Ty1 RNA levels and Ty1mhis3AI splicing efficiency: We isolated
total RNA from HSX1 and hsx1::LEU2 strains by established
procedures (Curcio, Sanders and Garfinkel 1988; Rose, Winston and
Hieter 1990). Northern analysis was used to analyze Ty1 RNA levels
(Curcio, Sanders and Garfinkel 1988; Curcio and Garfinkel 1992),
and reverse transcription-PCR (RT-PCR) was used to estimate
Ty1mhis3AI RNA splicing efficiency (Wang, Doyle and Mark 1989). The
total amount of RNA transferred to hybridization membranes was
estimated by staining with NAQ-STAIN, a reversible
fluorescein-based stain developed by Integrated Separation Systems.
Transcripts from the PYK1 and ACT1 genes were used as internal
loading standards. RNA sequences that span the region where the
artificial intron (AI) was inserted in HIS3 (Curcio and Garfinkel
1991) were amplified using the HIS3-specific oligonucleotide
primers CTCCACGCGCCAGTAGGGCC (for DNA amplification) and
ATGACAGAGCAGAAAGCCC (for reverse transcription and DNA
amplification). The amplified products were separated by agarose
gel electrophoresis through a 2% NuSieve/1% SeaKem (FMC
Bioproducts) gel, stained with ethidium bromide, and photographed.
The resulting negatives were scanned using an LKB Ultroscan XL
enhanced laser densitometer. Relative splicing efficiencies were
estimated from the amounts of the amplified products. The splicing
efficiency is defined as the amount of 334-bp spliced product over
the amount of spliced plus 438-bp unspliced products.
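
The splicing-efficiency definition above can be written as a short calculation. The sketch below is illustrative only: the function name and the densitometer readings are hypothetical and are not data from this study.

```python
# Sizes of the two RT-PCR products named in the text.
SPLICED_BP = 334
UNSPLICED_BP = 438


def splicing_efficiency(spliced_signal: float, unspliced_signal: float) -> float:
    """Return spliced / (spliced + unspliced) for two band intensities."""
    total = spliced_signal + unspliced_signal
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return spliced_signal / total


# Hypothetical densitometer readings in arbitrary units (not measured values).
example_efficiency = splicing_efficiency(spliced_signal=12.0, unspliced_signal=88.0)
print(f"{SPLICED_BP}-bp fraction: {example_efficiency:.0%}")
```

The ratio uses raw band intensities; any correction for the different lengths of the two products would be an additional assumption not described in the text.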

Immunoblot analysis: Total yeast protein isolation, polyacrylamide
gel electrophoresis, protein transfer, and antibody reactions were
performed as described previously (Youngren et al. 1988; Garfinkel
et al. 1991). Antibodies were added in at least 10-fold excess, as
determined by titration experiments. Ty1-VLP antibodies were
previously shown to react with TYA1 and TYA1-TYB1 precursor
proteins, but not with TYB1 proteins (Adams et al. 1987; Youngren
et al. 1988; Garfinkel et al. 1991). Ty1-VLP antibodies did not
show a dramatic difference in avidity for TYA1 vs. TYA1-TYB1
precursor proteins, as determined by titration experiments (A.-M.
Hedge and D. J. Garfinkel, unpublished results). Ty1-VLPs were
isolated by the method of Eichinger and Boeke (1988), except the
final continuous sucrose gradient was omitted. Equal amounts of
protein (approximately 20 µg per lane) were loaded onto SDS-8%
polyacrylamide gels. Protein concentrations were verified by
staining gels run in parallel with Coomassie blue. Cross-reactivity
of immunoblotted proteins with antisera that recognize the mature
proteins p54-TYA1 (Ty1-VLP antiserum; Adams et al. 1987; Youngren
et al. 1988), p90-Ty1-IN (B2 antiserum; Youngren et al. 1988),
p60-Ty1-RT/RH (B8 antiserum; Garfinkel et al. 1991), and their
respective precursor proteins was detected using the ECL
chemiluminescent detection system (Amersham).

Ty1 frameshifting efficiency: β-Galactosidase assays and the
efficiency of Ty1 frameshifting were determined as described
previously (Belcourt and Farabaugh 1990). Briefly, six
transformants of each plasmid were each assayed in triplicate for
β-galactosidase activity. The frameshifting efficiency is measured
by determining the ratio of β-galactosidase activity produced from
the construct requiring a +1 frameshift to express lacZ
(pMB38-9merWT) to that of a construct in which the upstream and
downstream genes are fused in frame [pMB38-9merFusion and
pMB38-9merFusion(w/o AGG)].
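
As a worked illustration of this ratio, the sketch below applies it to the mean β-galactosidase units reported in Table 5. The function and variable names are hypothetical; only the unit values come from the paper.

```python
def frameshift_efficiency(frameshift_units: float, in_frame_units: float) -> float:
    """Percent frameshifting: activity of the +1 reporter over the in-frame fusion."""
    if in_frame_units <= 0:
        raise ValueError("in-frame fusion activity must be positive")
    return 100.0 * frameshift_units / in_frame_units


# Mean beta-galactosidase units from Table 5, keyed by reporter construct.
HSX1_UNITS = {"9merWT": 2400, "9merFusion": 6800, "9merFusion(w/o AGG)": 8900}
MUTANT_UNITS = {"9merWT": 5100, "9merFusion": 5600, "9merFusion(w/o AGG)": 6100}

for label, units in (("HSX1", HSX1_UNITS), ("hsx1::HIS3", MUTANT_UNITS)):
    for fusion in ("9merFusion", "9merFusion(w/o AGG)"):
        pct = frameshift_efficiency(units["9merWT"], units[fusion])
        print(f"{label:11s} vs {fusion:20s}: {pct:.0f}%")
```

Rounded to whole percentages, these ratios agree with the efficiencies listed in Table 5.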

The efficiency of Ty1-H3 frameshifting was also estimated from
immunoblot analysis. Strains DG1333 (pGTy1-H3neo::SacI-1702, HSX1)
and DG1334 (pGTy1-H3neo::SacI-1702, hsx1::LEU2) were constructed by
transforming the plasmid pGTy1-H3neo::SacI-1702, which contains a
Ty1-PR mutation (Youngren et al. 1988), into strains JC344 and
KK157, respectively (Table 1). Total protein isolated from
galactose-grown cultures of strains DG1333 and DG1334 was analyzed
by immunoblotting using Ty1-VLP antiserum. To determine the ratio
of p58-TYA1 to p190-TYA1-TYB1 protein, exposures of the resulting
blots were scanned using a laser densitometer. The efficiency of
Ty1 frameshifting equals the amount of p190-TYA1-TYB1 protein
divided by the total amount of p58-TYA1 plus p190-TYA1-TYB1
protein.
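
The densitometry-based estimate can be sketched the same way. The band intensities below are hypothetical values chosen only to illustrate the formula; they are not measurements from the blots.

```python
def blot_frameshift_efficiency(p58_signal: float, p190_signal: float) -> float:
    """Fraction of precursor present as p190-TYA1-TYB1: p190 / (p58 + p190)."""
    total = p58_signal + p190_signal
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return p190_signal / total


# Hypothetical densitometer readings (arbitrary units), for illustration only.
wild_type = blot_frameshift_efficiency(p58_signal=97.0, p190_signal=3.0)
mutant = blot_frameshift_efficiency(p58_signal=50.0, p190_signal=50.0)
fold_increase = mutant / wild_type

print(f"wild type {wild_type:.0%}, mutant {mutant:.0%}, {fold_increase:.1f}-fold")
```

Because the two bands are compared within a single lane, the ratio does not depend on the absolute amount of protein loaded.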

RESULTS 

Ty1 transposition is inhibited in an hsx1 disruption mutant: We
determined whether a disruption mutation of HSX1 affects Ty1
transposition using two assays that monitor transposition of
chromosomal elements marked with the his3AI retrotransposition
indicator gene (Curcio and Garfinkel 1991), as well as by
monitoring the transposition of plasmid-borne pGTy1neo and
pGTy1made2AI elements (Boeke et al. 1985; Boeke, Xu and Fink 1988;
M. J. Curcio and D. J. Garfinkel, unpublished results). The his3AI
gene is a yeast HIS3 gene interrupted by an artificial intron (AI)
in the antisense orientation. The his3AI sequences are inserted in
a Ty1 element at a unique restriction site located between the TYB1
gene and the downstream long terminal repeat, such that the intron
is on the sense strand of the Ty1 element. Placement of marker
genes at this position of a Ty1 element does not severely inhibit
transposition. Since splicing and retrotransposition of the marked
Ty RNA give rise to His⁺ cells, the relative efficiency of
Ty1mhis3AI transposition can be monitored by plating cells on media
lacking histidine. An ade2AI retrotransposition indicator gene has
also been developed (M. J. Curcio and D. J. Garfinkel, unpublished
results).

First, the relative efficiency of Ty1mhis3AI transposition in
isogenic HSX1 and hsx1::LEU2 strains containing single marked
chromosomal elements Ty1mhis3AI-263 or Ty1mhis3AI-270 was
determined (Table 2). These unspliced Ty1mhis3AI elements were
identified after galactose induction of a strain containing plasmid
pGTy1-H3mhis3AI, and are present at different chromosomal locations
(Curcio and Garfinkel 1991). There was a 53- or 74-fold decrease in
the efficiency of Ty1mhis3AI-263 or Ty1mhis3AI-270 transposition,
respectively, as monitored by the number of His⁺ colonies in an
hsx1::LEU2 mutant background. The transposition defect in the
hsx1::LEU2 mutant KK157 was complemented by a low copy number
plasmid carrying the wild-type HSX1 gene (pKK67), but not by a
plasmid carrying a mutant hsx1(MluI*) gene (pKK68) (Table 3).

TABLE 2

Ty1mhis3AI transposition in an hsx1 disruption mutant

Genotype      Ty1mhis3AI        His⁺ colonies/total cells (×10⁷)         Relative transposition efficiency
HSX1          Ty1mhis3AI-263    25/1.6, 46/1.6, 40/1.8, 36/1.9, 30/1.9   2.0 × 10⁻⁶
hsx1::LEU2    Ty1mhis3AI-263    0/2.3, 0/2.3, 1/2.2, 2/2.2, 0/2.1        2.7 × 10⁻⁸
HSX1          Ty1mhis3AI-270    28/1.6, 34/1.6, 22/1.4, 32/1.4, 36/1.6   2.0 × 10⁻⁶
hsx1::LEU2    Ty1mhis3AI-270    3/1.7, 0/2.3, 0/2.4, 0/2.1, 1/1.8        3.8 × 10⁻⁸

The Ty1mhis3AI-263 element is present in HSX1 strain JC287 and
hsx1::LEU2 strain KK156. The Ty1mhis3AI-270 element is present in
HSX1 strain JC344 and hsx1::LEU2 strain KK157. Each measurement
represents the results of one of five independent cultures. The
relative transposition efficiency is the mean fraction of total
colonies that are His⁺. To estimate the efficiency of Ty1
transposition, the relative transposition efficiency should be
multiplied by a factor of 8, to account for the splicing efficiency
of the Ty1mhis3AI transcript, and by a factor of 11, to account for
the effect of introducing the his3AI marker gene into a Ty1 element
(Curcio and Garfinkel 1991).
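
The relative transposition efficiencies in Table 2 can be approximated from the per-culture counts. The sketch below pools the five cultures for the Ty1mhis3AI-263 element; pooling is an assumption about how the mean was taken, and it assumes the total-cell column is in units of 10⁷. The function names are hypothetical.

```python
# (His+ colonies, total cells in units of 1e7) for five cultures, from Table 2.
HSX1_263 = [(25, 1.6), (46, 1.6), (40, 1.8), (36, 1.9), (30, 1.9)]
MUTANT_263 = [(0, 2.3), (0, 2.3), (1, 2.2), (2, 2.2), (0, 2.1)]

CELL_UNIT = 1e7          # assumed unit of the total-cell column
SPLICING_FACTOR = 8      # correction for splicing efficiency (table footnote)
MARKER_FACTOR = 11       # correction for the his3AI marker (table footnote)


def relative_efficiency(cultures, unit=CELL_UNIT):
    """Pooled His+ colonies divided by pooled total cells."""
    colonies = sum(count for count, _ in cultures)
    cells = sum(total for _, total in cultures) * unit
    return colonies / cells


def estimated_ty1_efficiency(relative):
    """Apply the two correction factors given in the Table 2 footnote."""
    return relative * SPLICING_FACTOR * MARKER_FACTOR


wild_type = relative_efficiency(HSX1_263)
mutant = relative_efficiency(MUTANT_263)
print(f"HSX1: {wild_type:.1e}  hsx1::LEU2: {mutant:.1e}")
print(f"estimated Ty1 efficiency in HSX1: {estimated_ty1_efficiency(wild_type):.1e}")
```

The two correction factors multiply to 88, so the corrected estimate for an unmarked Ty1 element is roughly two orders of magnitude above the measured His⁺ frequency.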

The second transposition assay depends upon the ability of a pGTy1
helper plasmid to stimulate transposition of a genomic Ty1mhis3AI
element in trans (Curcio and Garfinkel 1992). Expression of the
pGTy1-H3 helper plasmid increases the frequency of genomic
Ty1mhis3AI transposition about 100-fold (Curcio and Garfinkel 1992;
M. J. Curcio and D. J. Garfinkel, unpublished results). The
pGTy1-H3neo helper plasmid (Boeke, Xu and Fink 1988) or the control
plasmid pGAL1-lacZ (Boeke et al. 1985) was introduced into isogenic
strains JC344 (HSX1) and KK157 (hsx1::LEU2) that also contain the
chromosomal Ty1mhis3AI-270 element. Ty1 transposition was induced
by growing the cells on SC-ura (galactose) plates and spliced
Ty1mHIS3 transposition events were detected by replica plating onto
SC-his-ura (glucose) plates (Figure 1). The HSX1 strain DG1301
(containing the pGAL1-lacZ control plasmid) gave rise to a few
transposition events, while the HSX1 strain DG1302 (containing the
pGTy1-H3neo helper plasmid) gave rise to hundreds of transposition
events. In contrast, no Ty1mHIS3 transposition events were present
in the hsx1::LEU2 strains DG1305 and DG1306, even though strain
DG1306 contains a pGTy1-H3neo helper plasmid that was induced for
transposition. Since the hsx1::LEU2 mutation is recessive
(Table 3), we showed that the pGTy1-H3neo helper plasmid is
transposition-competent by testing pGTy1-H3neo transposition in an
hsx1::LEU2/HSX1 diploid strain (Table 4).

TABLE 3

Ty1mhis3AI-270 transposition in hsx1 mutant KK157
containing plasmid copies of tRNA genes

Plasmid (genotype)         His⁺ colonies/total colonies (×10⁶)      Relative transposition efficiency
pKK67 (HSX1)               17/5.3, 58/2.3, 16/3.6, 18/5.3, 23/4.1   6.4 × 10⁻⁶
pKK68 [hsx1(MluI*)]        0/4.5, 0/4.6, 0/4.4, 0/2.3, 0/4.5        <4.9 × 10⁻⁸
pKK69 [SUP201-0(UCU)]      0/5.7, 0/6.5, 0/5.2, 0/5.6, 0/5.9        <3.5 × 10⁻⁸
pKK71 [SUP201-0-1(CCU)]    4/7.5, 3/5.6, 7/5.3, 11/5.9, 8/6.1       1.1 × 10⁻⁶

The Ty1mhis3AI-270 element is present in the hsx1::LEU2 strain
KK157. The designated plasmids were introduced into strain
KK157 and single transformants were chosen for further analysis.
Refer to Table 2 for more information.
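
Where no His⁺ colonies were recovered, Table 3 reports an upper bound rather than a frequency. One way to reproduce such a bound is to take the reciprocal of the total number of cells screened, as sketched below. This is an assumption about how the bounds were derived, and it further assumes that the total-cell column of Table 3 is in units of 10⁶ (the header is damaged in this copy). The function name is hypothetical.

```python
CELL_UNIT = 1e6  # assumed unit of the Table 3 total-cell column


def upper_bound_efficiency(totals_per_culture, unit=CELL_UNIT):
    """Upper bound when zero events are seen: fewer than 1 event per cell screened."""
    cells_screened = sum(totals_per_culture) * unit
    if cells_screened <= 0:
        raise ValueError("at least one cell must have been screened")
    return 1.0 / cells_screened


# Total cells per culture (in units of 1e6) for the two plasmids with no His+ colonies.
PKK68_TOTALS = [4.5, 4.6, 4.4, 2.3, 4.5]
PKK69_TOTALS = [5.7, 6.5, 5.2, 5.6, 5.9]

bound_pkk68 = upper_bound_efficiency(PKK68_TOTALS)
bound_pkk69 = upper_bound_efficiency(PKK69_TOTALS)
print(f"pKK68: <{bound_pkk68:.1e}   pKK69: <{bound_pkk69:.1e}")
```

Under these assumptions the bounds come out near 4.9 × 10⁻⁸ and 3.5 × 10⁻⁸, matching the mantissas printed in Table 3.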

Several controls were performed to determine whether the hsx1
mutation directly affected the Ty transposition process or whether
the hsx1 mutation affected RNA splicing or Ty RNA levels. The
splicing efficiency of the Ty1mhis3AI-270 transcript varied from 12
to 20% in both HSX1 and hsx1::LEU2 strains as determined by RT-PCR.
These splicing efficiencies agree with previous results where it
was shown that about 12% of the Ty1mhis3AI transposition events had
lost the AI by splicing (Curcio and Garfinkel 1991). However, the
overall Ty1 and Ty1mhis3AI-270 RNA levels were between 2- and
8-fold lower in an hsx1::LEU2 mutant background when compared with
ACT1, PYK1 RNA or rRNA levels, although these differences were not
completely reproducible.

[Figure 1 panels: DG1301, pGAL-lacZ, HSX1; DG1305, pGAL-lacZ,
hsx1::LEU2; DG1302, pGTy1-H3neo, HSX1; DG1306, pGTy1-H3neo,
hsx1::LEU2]

Figure 1. Ty1mhis3AI-270 transposition in an hsx1 mutant
background. Strains DG1301, DG1302, DG1305 and DG1306 contain the
genomic Ty1mhis3AI-270 element. The relevant plasmids and status of
the HSX1 gene are shown alongside the strains. These strains were
tested for transposition by growing cells on SC-ura (galactose)
plates for 7 days at 20°, replica plating to SC-his-ura (glucose),
and incubating the replicas for 3 days at 30°.

TABLE 4

Ty1neo transposition in hsx1/HSX1 diploid strains

Relevant genotypeᵃ       Relative transposition efficiency (%)ᵇ
HSX1/HSX1                42 (15/36)
hsx1::LEU2/HSX1          47 (16/34)

ᵃ Homozygous HSX1/HSX1 diploids were obtained by mating strains
DG1302 and KD198-16A. Heterozygous hsx1::LEU2/HSX1 diploids were
obtained by mating strains DG1306 and KD198-16A.
ᵇ In this transposition test, the transposition efficiency is the
number of G418ʳ, Ura⁻ plasmid segregants divided by the total
number of Ura⁻ plasmid segregants.

To determine whether this moderate decrease in the level of Ty RNA
could account for the more than 50-fold reduction in Ty1
transposition, we assayed the level of pGTy1-H3made2AI
retrotransposition (M. J. Curcio and D. J. Garfinkel, unpublished
results) in an hsx1::LEU2 mutant. In collateral experiments, the
level of pGTy1 expression in an hsx1::LEU2 mutant was determined by
immunoblotting (see below). The efficiency of Ty1made2AI
transposition was reduced almost 70-fold in an hsx1::LEU2 mutant
background, while the level of GAL1-promoted Ty1 proteins remained
unchanged in the mutant (Figure 2). A similar decrease in
transposition was also observed when an HSX1 strain containing a
pGTy1A-Bneo plasmid with a mutation that corrects the frameshift
was galactose-induced. Taken together, these results suggest that
neither inhibition of splicing nor the lower concentration of
chromosomal Ty1 or Ty1mhis3AI RNA can completely account for the
reduction of Ty1 transposition in an hsx1::LEU2 mutant. Previous
analyses have shown that increased expression of tRNA-Arg(CCU)
(HSX1) negatively regulates Ty1 transposition (Xu and Boeke 1990).
Our results indicate that the HSX1 gene is required for
transposition of Ty1 elements.

[Figure 2: immunoblot, lanes 1-6; size markers at 190, 140 and 90;
VLP panel label]

Figure 2. Immunoblot analysis of Ty1 proteins from an hsx1 mutant
background. Strains DG1302 (HSX1, pGTy1-H3neo; lane 1), DG1301
(HSX1, pGAL-lacZ; lane 2), DG1306 (hsx1::LEU2, pGTy1-H3neo;
lane 3), DG1305 (hsx1::LEU2, pGAL-lacZ; lane 4), DG1333 (HSX1,
pGTy1-H3neo::SacI-1702; lane 5), and DG1334 (hsx1::LEU2,
pGTy1-H3neo::SacI-1702; lane 6) were induced for transposition by
growth in SC-ura (galactose) medium and total protein was isolated
for immunoblot analysis. Proteins were separated by electrophoresis
on an SDS-8% polyacrylamide gel, transferred to a nitrocellulose
membrane, and cross-reacted with B2 and VLP antisera. The B2
antiserum detects p90-Ty1-IN and its precursors. The VLP antiserum
detects p54 and p58, which are VLP structural proteins derived from
TYA1, as well as p190-TYA1-TYB1. The minor bands observed between
p90-Ty1-IN and p140-TYB1 are probably caused by cellular
proteolysis because they are present in immunoblots prepared from a
Ty1-PR mutant (S. D. Youngren and D. J. Garfinkel, unpublished
results). Ty1 protein size estimates (in kilodaltons) are
indicated.

Mature TYB1 proteins do not accumulate in an hsx1 disruption
mutant: To further investigate the inhibition of Ty1 transposition
by hsx1::LEU2, we compared the levels and processing of
Ty1-encoded proteins in isogenic HSX1 and hsx1 disruption strains
(Figure 2). Total protein was isolated from strains DG1302 (HSX1,
pGTy1-H3neo; lane 1), DG1301 (HSX1, pGAL1-lacZ; lane 2), DG1306
(hsx1::LEU2, pGTy1-H3neo; lane 3), and DG1305 (hsx1::LEU2,
pGAL1-lacZ; lane 4) that were induced with galactose. The proteins
were separated on SDS-polyacrylamide gels and immunoblotted. The
resulting filters were reacted with B2 antiserum, which reacts with
the full-length 190-kilodalton (kD) TYA1-TYB1 precursor protein,
the 160-kD and 140-kD processing intermediates, and mature 90-kD
Ty1-IN (Garfinkel et al. 1991), or with Ty1-VLP antiserum, which
reacts with the 58-kD TYA1 precursor protein and the mature 54-kD
TYA1 product (Adams et al. 1987; Muller et al. 1987; Youngren et
al. 1988). Wild-type protein patterns were observed when the HSX1
strain DG1302 was analyzed with B2 or Ty1-VLP antiserum (lane 1),
or with an antiserum (B8) that detects p60-Ty1-RT/RH (B. Faiola and
D. J. Garfinkel, data not shown). As expected, strains DG1301
(lane 2) and DG1305 (lane 4) containing the heterologous expression
plasmid pGAL-lacZ had very low levels of Ty1 proteins (Garfinkel et
al. 1985; Curcio and Garfinkel 1992).

The hsx1::LEU2 strain DG1306 (Figure 2, lane 3) displayed a
different protein pattern when reacted with B2 and Ty1-VLP
antisera. Essentially wild-type levels of the 190-kD TYA1-TYB1
precursor protein and 160-kD processing intermediate were detected
using B2 antiserum. However, very little of the 140-kD precursor or
90-kD IN protein was detected. Similar results were obtained when
an antiserum (B8) that detects RT/RH was used: the 190-kD and
160-kD TYB1 precursor proteins were present at wild-type levels,
but the 140-kD precursor and the 60-kD Ty1 RT/RH protein were
barely detectable (B. Faiola and D. J. Garfinkel, data not shown).
When TYA1 proteins were analyzed with Ty1-VLP antiserum, normal
levels of mature p54-TYA1 protein were observed in an hsx1 mutant,
but very little full-length p58-TYA1 precursor was detected even
after extended exposure of the filter. Furthermore, similar protein
patterns were observed when partially purified Ty1-VLPs were
reacted with B2, B8, or Ty1-VLP antisera (B. Faiola and D. J.
Garfinkel, data not shown). These results suggest that the
transposition defect observed in hsx1 mutants is related to
aberrant protein processing.

Ty1 frameshifting increases in an hsx1 disruption mutant: We tested
whether the observed transposition defect in the hsx1 mutant
resulted from abnormal frameshifting using two different
frameshifting assays. In the first assay, the HSX1 strain KK242 and
hsx1::HIS3 mutant strain KK240 were transformed with
pMB38-9merFusion and pMB38-9merWT plasmids in which the 0 (TYA1)
and +1 (TYA1-TYB1) reading frames and lacZ are fused, respectively
(Table 5). β-Galactosidase activity was determined from at least
six different transformants of each plasmid and Ty1 frameshifting
efficiencies were calculated as described (see MATERIALS AND
METHODS; Belcourt and Farabaugh 1990). A frameshifting efficiency
of 35% was obtained in an HSX1 background, which is comparable to
published values (Belcourt and Farabaugh 1990). In contrast, the
hsx1::HIS3 disruption resulted in 91% frameshifting. The
frameshifting efficiency was restored to 35% by a low copy number
plasmid carrying the wild-type HSX1 gene (pKK73; Table 6).

TABLE 5

Translational frameshifting in an hsx1 mutant

Relevant genotype   Frameshift site          β-Galactosidase units   Frameshifting efficiency (%)
HSX1                9merWT                   2400
                    9merFusion               6800                    35
                    9merFusion(w/o AGG)      8900                    27
hsx1::HIS3          9merWT                   5100
                    9merFusion               5600                    91
                    9merFusion(w/o AGG)      6100                    84

Strains KK242 (HSX1) and KK240 (hsx1::HIS3) were transformed with
plasmids pMB38-9merWT, pMB38-9merFusion, and
pMB38-9merFusion(w/o AGG). β-Galactosidase activities are the
averages from six independent transformants. The frameshift
efficiency is defined as the β-galactosidase activity of the 9merWT
divided by the β-galactosidase activity of either the 9merFusion or
the 9merFusion(w/o AGG) (Belcourt and Farabaugh 1990).

TABLE 6

Translational frameshifting in an hsx1 mutant KK240
containing plasmid copies of tRNA genes

Plasmid genotype           Frameshifting efficiency (%)
pKK73 (HSX1)               35
pKK74 [hsx1(MluI*)]        98
pKK75 [SUP201-0(UCU)]      90
pKK76 [SUP201-0-1(CCU)]    65

Plasmids were introduced into strain KK240 (hsx1::HIS3) by
transformation. Refer to Table 5 for experimental details.

We also determined the Ty1 frameshifting efficiency by quantitating
the ratio of the unprocessed p58-TYA1 precursor to the
p190-TYA1-TYB1 precursor in HSX1 and hsx1::LEU2 strains DG1333 and
DG1334, respectively (Figure 2, lanes 5 and 6). To ensure that
unprocessed precursor proteins accumulated during the galactose
induction, strains DG1333 and DG1334 contained a pGTy1-H3 plasmid
with a well characterized Ty1-PR mutation, pGTy1-H3neo::SacI-1702
(Youngren et al. 1988; Garfinkel et al. 1991; Curcio and Garfinkel
1992). Proteins were analyzed by immunoblotting using Ty1-VLP
antiserum, which recognizes TYA1 proteins and the 190-kD TYA1-TYB1
precursor protein (Adams et al. 1987; Muller et al. 1987; Youngren
et al. 1988), and frameshifting efficiencies were calculated by
densitometry (see MATERIALS AND METHODS).

The HSX1 strain DG1333 (Figure 2, lane 5) showed the pattern of
unprocessed 58-kD and 190-kD proteins expected from a Ty1-PR mutant
(Adams et al. 1987; Muller et al. 1987; Youngren et al. 1988). A
frameshifting efficiency of about 3% was obtained from densitometry
scans of various exposures of the immunoblot. In contrast, the
hsx1::LEU2 strain DG1334 (Figure 2, lane 6) had much more of the
190-kD TYA1-TYB1 precursor and slightly less of the 58-kD TYA1
precursor than the HSX1 parent strain DG1333 (Figure 2, lane 5).
The hsx1::LEU2 disruption mutant had a frameshifting efficiency of
about 50%, which is about 17-fold higher than in an HSX1
background. The overall level of Ty1 protein also appeared to be
similar in the HSX1 and hsx1 mutant backgrounds. These results
suggest that the absence of tRNA-Arg(CCU) enhances ribosomal
pausing at AGG and slippage of the leucyl-tRNA from CUU to UUA.
Furthermore, the regulation of frameshifting by the HSX1 gene is
essential for Ty1 transposition. The reduction in transposition in
an hsx1 mutant may be caused by a defect in protein processing that
results from an aberrant stoichiometry of Ty proteins.
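
The slippage described above can be made concrete by reading the frameshift site in both frames. The sketch below assumes the heptamer is CUU AGG C, as reported for Ty1 by Belcourt and Farabaugh (1990); the helper function is hypothetical.

```python
# Assumed Ty1 frameshift heptamer (RNA, 5' to 3'): CUU AGG C.
HEPTAMER = "CUUAGGC"


def codons(sequence: str, frame: int) -> list[str]:
    """Split a sequence into complete codons starting at the given offset."""
    return [sequence[i:i + 3] for i in range(frame, len(sequence) - 2, 3)]


zero_frame = codons(HEPTAMER, 0)      # frame read before the shift (TYA1)
plus_one_frame = codons(HEPTAMER, 1)  # frame read after the +1 shift (TYB1)

print("0 frame :", zero_frame)
print("+1 frame:", plus_one_frame)
```

In the 0 frame the ribosome meets the rare AGG codon immediately after CUU; in the +1 frame the same leucyl-tRNA is positioned on UUA and the next codon is GGC, so AGG is never decoded.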

The capacity to translate an AGG codon does not limit
β-galactosidase synthesis in an hsx1 mutant: The lacZ fusion gene
in the pMB38-9merFusion plasmid has only one AGG codon and it is
located at the fusion site (Belcourt and Farabaugh 1990). That AGG
codon is missing in the pMB38-9merFusion(w/o AGG) lacZ fusion gene.
Therefore, the effect of a single AGG codon on β-galactosidase
synthesis was determined in an hsx1::HIS3 mutant. Interestingly,
β-galactosidase activities in the hsx1::HIS3 mutant or the HSX1
parental strain harboring the pMB38-9merFusion and the
pMB38-9merFusion(w/o AGG) plasmids were similar (Table 5). These
results suggest that the capacity to translate the AGG codon does
not limit β-galactosidase synthesis in an hsx1 mutant. However, we
do not know how the AGG is translated in an hsx1 mutant. Since
haploid cells contain more than eight tRNA-Arg(UCU) genes
(Beckmann, Johnson and Abelson 1977), it is possible that
tRNA-Arg(UCU) decodes AGG codons by near-cognate recognition when
tRNA-Arg(CCU) is absent (Yokoyama et al. 1985).

Complementation of hsx1 by a tRNA suppressor SUP201-0-1(CCU):
Although tRNA-Arg(UCU) may decode AGG codons, excess tRNA-Arg(UCU)
does not inhibit frameshifting (Belcourt and Farabaugh 1990). This
may be because of sequence or structural differences between
tRNA-Arg(UCU) and tRNA-Arg(CCU) (Figure 3). Alternatively, the
information needed to regulate Ty1 frameshifting may reside within
the anticodon. To determine if the CCU anticodon is sufficient to
regulate Ty1 transposition (Table 3) and frameshifting (Table 6),
we constructed a low-copy-number plasmid carrying a mutant tRNA-Arg
gene that has a CCU instead of a UCU anticodon. The SUP201-0-1(CCU)
anticodon mutation was introduced into the SUP201-0 tRNA-Arg(UCU)
gene (Thireos, Penn and Greer 1984; Morishita and Uno 1991) by
oligonucleotide mutagenesis (refer to MATERIALS AND METHODS).
Functionally active tRNAs were synthesized from these plasmids
because a plasmid carrying the same 112-bp segment of DNA with a
SUP201 nonsense suppressor complemented the cyr1-2 UGA allele
(Morishita and Uno 1991; K. Kawakami and Y. Nakamura, unpublished
results).

To determine if SUP201-0-1(CCU) could suppress the transposition
defect imposed by an hsx1 mutation, strain KK157 containing
Ty1mhis3AI-270 and hsx1::LEU2 was transformed with the suppressor
plasmid pKK71 [SUP201-0-1(CCU)] or the parental plasmid pKK69
[SUP201-0(UCU)]. The level of Ty1 transposition was partially
restored when the pKK71 [SUP201-0-1(CCU)] plasmid was present in
the hsx1::LEU2 mutant (Table 3). This result suggests that the CCU
anticodon can regulate transposition.

An hsx1::HIS3 mutant strain KK240 harboring plasmids pMB38-9merWT
or pMB38-9merFusion was transformed with plasmids pKK75
[SUP201-0(UCU)] and pKK76 [SUP201-0-1(CCU)] and frameshifting
efficiencies were analyzed in these transformants (Table 6). The
SUP201-0-1(CCU) mutant tRNA resulted in an intermediate level of
frameshifting. Interestingly, frameshifting in the pKK76
[SUP201-0-1(CCU)] transformant was higher (65%) than in the pKK73
[HSX1; tRNA-Arg(CCU)] transformant (35%). This result is consistent
with the lower level of transposition of the pKK71
[SUP201-0-1(CCU)] transformant (1.1 × 10⁻⁶) when compared to the
pKK67 [HSX1; tRNA-Arg(CCU)] transformant (6.4 × 10⁻⁶; Table 3).
Therefore, although SUP201-0-1(CCU) can partially regulate Ty
transposition and frameshifting, it does not work as well as
tRNA-Arg(CCU) encoded by HSX1. Other aspects of SUP201-0-1(CCU)
expression or structure may prevent full complementation of the
hsx1 mutation. These results also suggest that base pairing at the
third position of the second codon in the frameshift heptamer is
essential for regulating Ty1 transposition and frameshifting.
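
The difference between the two anticodons at the third codon position can be illustrated with a simplified pairing model. The sketch below considers only standard Watson-Crick and G·U wobble pairs and ignores modified nucleosides, which real tRNAs commonly carry; it is a teaching aid under those assumptions, not a model of decoding. All names are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def classify_pairing(codon: str, anticodon: str) -> list[str]:
    """Classify each codon position against an anticodon (both written 5' to 3').

    Pairing is antiparallel, so codon position 1 faces anticodon position 3.
    """
    labels = []
    for codon_base, anticodon_base in zip(codon, reversed(anticodon)):
        pair = (codon_base, anticodon_base)
        if pair in WATSON_CRICK:
            labels.append("Watson-Crick")
        elif pair in WOBBLE:
            labels.append("wobble")
        else:
            labels.append("mismatch")
    return labels


cognate = classify_pairing("AGG", "CCU")       # tRNA-Arg(CCU) reading AGG
near_cognate = classify_pairing("AGG", "UCU")  # tRNA-Arg(UCU) reading AGG

print("CCU:", cognate)
print("UCU:", near_cognate)
```

In this model the CCU anticodon pairs with AGG at all three positions, whereas the UCU anticodon relies on a G·U wobble at the third codon position, the position singled out in the text above.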

Increasing TYA1 expression restores Tyi trans- 
position in an hsxl mutant: Our results indicate that 
more of TYA1-TYB1 fusion protein is translated in 
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SUP201-0 GCUCCCGUGCCGUAAUGGaACGCGUCUGACUUCUAAUCAGAACAUUAUGGGUUCGACCCCCAUCGUUCUC 
* *• ••*•*•♦*•** •* ***■ ••«*•• • • «• • 

HSX1 GUUCCGUUGGCGUAAUGGUAA^GilCUCCCUCCUAAGGAGAAGACUGCGGGUUCGAGUCCCGUACGGAACG 
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Figure S.— Comparison of SVP2Q1-0 and 
HSXL Nucleotide sequences of the SUP201-O 
and HSX1 tRNAs are shown. Identical nucleo- 
tides are indicated by asterisks. Anticodons are 
in bold lettering. The clover leaf structure of 
the SUP201-0 and HSX1 genes, and the 
SUP201-0-1(CCU) mutation are also shown. 
the hsx1 disruption mutant, and that this altered stoichiometry 
of TYA1 to TYA1-TYB1 precursor proteins may inhibit Ty1 
transposition. Therefore, providing more TYA1 protein should 
restore Ty1 transposition in an hsx1 mutant background. To test 
this idea, the efficiency of Ty1 transposition was determined 
with strains DG1344 (HSX1) and DG1347 (hsx1::LEU2) containing 
the helper plasmid pGTyA1neo (PGK1 ter.), and strains DG1301 
(HSX1) and DG1305 (hsx1::LEU2) containing the heterologous 
expression plasmid pGAL-lacZ. The helper plasmid pGTyA1neo 
(PGK1 ter.) contains a complete TYA1 gene, about 25 codons of 
N-terminal TYB1 sequence, the neo marker gene, and a 
transcriptional terminator from the PGK1 gene in place of the 
downstream long terminal repeat. Liquid cultures of strains 
DG1301, DG1305, DG1344 and DG1347 were galactose-induced and 
transpositions of the chromosomal Ty1mhis3AI-270 element were 
selected on SC-his-ura (glucose) medium (Table 7). Although 
expression of the helper plasmid pGTyA1neo (PGK1 ter.) did not 
markedly affect Ty1 transposition in an HSX1 strain, expression 
of TYA1 stimulated Ty1 transposition 50-fold more in an hsx1 
strain than in an HSX1 strain. In addition, galactose-induction 
of pGTyA1neo (PGK1 ter.) did not affect the level of full-length 
Ty1 RNA in an HSX1 or hsx1 disruption strain (B. Faiola and 
D. J. Garfinkel, data not shown). As expected, the level of the 
TyA1neo (PGK1 ter.) transcript was the same in the HSX1 and 
hsx1::LEU2 mutant strains. These results indicate that 
expression of pGTyA1neo 



TABLE 7 

Effect of pGTyA1neo (PGK1 ter.) expression on Ty1mhis3AI-270 
transposition in an hsx1 mutant 

Strain    Relevant genotype                     Relative transposition efficiency 
DG1301    HSX1, pGAL1-lacZ                      1.3 × 10^-7 
DG1344    HSX1, pGTyA1neo (PGK1 ter.)           4.8 × 10^-7 
DG1305    hsx1::LEU2, pGAL1-lacZ                <3.7 × 10^-9 
DG1347    hsx1::LEU2, pGTyA1neo (PGK1 ter.)     2.4 × 10^-5 

These strains contain the genomic Ty1mhis3AI-270 element. 
Relative transposition efficiencies were determined from liquid 
cultures grown in SC-ura (galactose) as described in MATERIALS AND 
METHODS. The relative transposition efficiency is the number of 
His+, Ura+ colonies divided by the number of Ura+ colonies. Each 
measurement represents the mean of four cultures. The total number 
of colony-forming units was similar within each set of cultures. 
Refer to Table 2 for more information. 

(PGK1 ter.) stimulates Ty1 transposition in an hsx1 
mutant background by restoring the proper stoichiometry 
of TYA1 to TYA1-TYB1 precursor proteins. 
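
The fold differences quoted for Table 7 can be checked with a few lines of arithmetic. The sketch below simply re-enters the four efficiencies from the table; the dictionary and function names are illustrative and are not part of the original study, and the DG1305 value is an upper bound, so the last ratio is only a minimum estimate.

```python
# Relative transposition efficiencies copied from Table 7
# (His+ Ura+ colonies divided by Ura+ colonies).
efficiency = {
    "DG1301": 1.3e-7,  # HSX1, pGAL1-lacZ
    "DG1344": 4.8e-7,  # HSX1, helper plasmid
    "DG1305": 3.7e-9,  # hsx1::LEU2, pGAL1-lacZ (reported as "<", an upper bound)
    "DG1347": 2.4e-5,  # hsx1::LEU2, helper plasmid
}


def fold(numerator, denominator):
    """Ratio of two relative transposition efficiencies."""
    return efficiency[numerator] / efficiency[denominator]


# Effect of the helper plasmid in the wild-type (HSX1) background.
helper_effect_wild_type = fold("DG1344", "DG1301")
# hsx1 mutant versus HSX1, both carrying the helper plasmid.
mutant_vs_wild_type_with_helper = fold("DG1347", "DG1344")
# Minimum stimulation by the helper plasmid within the hsx1 mutant.
min_rescue_in_mutant = fold("DG1347", "DG1305")

print(f"helper effect in HSX1: {helper_effect_wild_type:.1f}-fold")
print(f"hsx1 vs HSX1 with helper: {mutant_vs_wild_type_with_helper:.0f}-fold")
print(f"rescue in hsx1: at least {min_rescue_in_mutant:.0f}-fold")
```

The second ratio reproduces the 50-fold figure given in the text, and the first shows why the helper plasmid is described as having no marked effect in an HSX1 strain.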

DISCUSSION 

Our study reveals that the HSX1 gene is necessary for Ty1 
transposition because elimination of this gene causes a 
significant transposition defect. Our work also shows that Ty1 
translational frameshifting increases dramatically in an hsx1 
disruption mutant. The hsx1 mutant defects in frameshifting and 
Ty1 transposition are completely complemented by the wild-type 
HSX1 gene and partially complemented by the mutant 
SUP201-0-1(CCU) gene, while no complementation occurs with the 
SUP201-0(UCU) gene (Tables 3 and 6). Therefore, at least some of 
the information required for Ty1 frameshifting is provided by 
the CCU anticodon. The partial complementation activity of the 
mutant SUP201-0-1 tRNA-Arg(CCU) suggests two possibilities. 
First, SUP201-0-1(CCU) may be expressed at a lower level than 
HSX1, thus directly affecting the level of tRNA-Arg(CCU) 
available for frameshifting. Second, SUP201-0-1(CCU) may not 
recognize the AGG codon within the context of the frameshift 
heptamer as well as HSX1, since the SUP201-0-1 and HSX1 tRNA 
genes differ by 20 nucleotide changes (Figure 3). Raftery and 
Yarus (1987) have shown that the structure of the proximal 
anticodon stem affects efficiency of a tRNA suppressor of 
E. coli and suggested that it is a part of the extended 
anticodon. The 2-bp difference in the anticodon stem between 
the SUP201-0-1 and HSX1 tRNAs may result in the altered AGG 
codon recognition activity of the SUP201-0-1 tRNA. 

Both -1 and +1 frameshifting mechanisms used by a variety of 
RNA viruses, retroviruses, and retrotransposons apparently 
require a translational pause for optimum efficiency (reviewed 
by Hatfield et al. 1992). For example, the translational pause 
in retroviral -1 frameshifting is created by a pseudoknot 
located a few nucleotides downstream of the frameshift, whereas 
Ty1 +1 frameshifting uses the rare tRNA-Arg(CCU). Our results 
are consistent with the +1 frameshifting model proposed by 
Belcourt and Farabaugh (1990). According to this model, the 
increase in +1 frameshifting results from a longer translational 
pause in an hsx1 mutant created by the absence of tRNA-Arg(CCU). 
The longer translational pause regulates translation of TYB1-pol 
by allowing more time for the tRNA-Leu to slip from the 0-frame 
CUU codon in TYA1 to the +1-frame UUA codon in TYB1. 
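
The slip described in this model can be made concrete by reading the frameshift heptamer in both frames. The following is a minimal sketch, assuming the heptamer is the CUU-AGG-C sequence named in this section; the helper name `codons` is illustrative only.

```python
def codons(rna, start=0):
    """Split an RNA string into complete triplets beginning at index `start`."""
    return [rna[i:i + 3] for i in range(start, len(rna) - 2, 3)]


HEPTAMER = "CUUAGGC"  # CUU-AGG-C frameshift site

# 0 frame: tRNA-Leu decodes CUU, and the next codon is the rare AGG,
# which waits for the scarce tRNA-Arg(CCU).
zero_frame = codons(HEPTAMER, start=0)

# +1 frame: after tRNA-Leu slips forward by one base it pairs with UUA,
# and translation continues with GGC.
plus_one_frame = codons(HEPTAMER, start=1)

print("0 frame: ", zero_frame)
print("+1 frame:", plus_one_frame)
```

In the 0 frame the ribosome stalls at AGG; in the +1 frame that codon is never read, which is why a shortage of tRNA-Arg(CCU) favors the shifted reading.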

Two different approaches were used to estimate the increase in 
frameshifting that occurs in an hsx1 disruption mutant. First, 
frameshifting was measured using the minimal heptamer sequence 
with lacZ as a reporter gene (Belcourt and Farabaugh 1990). 
The absence of tRNA-Arg(CCU) increased frameshifting as measured 
by β-galactosidase activity about 3-fold. Second, frameshifting 
was measured by immunoblotting using Ty1-VLP antiserum and a 
Ty1-PR mutant defective in protein processing. The increase in 
frameshifting at the CUU-AGG-C sequence leads to accumulation of 
slightly less p54-TYA1 protein and much more p190-TYA1-TYB1 
fusion protein. Using this assay, frameshifting increased about 
17-fold in an hsx1 background. 

We estimate that Ty1 frameshifting occurs at about a 3% 
efficiency in an HSX1 background by immunoblotting. In other 
words, 3% of ribosomes translating the TYA1-gag open reading 
frame undergo a +1 frameshift and continue translating the 
TYB1-pol open reading frame. It is somewhat surprising that the 
Ty1 frameshifting efficiency of 3% is about 5-10-fold lower than 
that obtained by lacZ fusion analysis. It is possible that we 
have underestimated the Ty1 frameshifting efficiency obtained 
from immunoblotting because of an inability to detect the 
p190-TYA1-TYB1 precursor protein. However, control experiments 
suggest that p58-TYA1 and p190-TYA1-TYB1 are transferred at 
about the same rate under the immunoblotting conditions used in 
this study, bind to Ty1-VLP antiserum with comparable affinities, 
and have similar turnover rates (A.-M. Hedge and D. J. Garfinkel, 
unpublished results; Curcio and Garfinkel 1992). There may also 
be differences in translation rates of lacZ in yeast, or in the 
placement of the frameshift heptamer relative to the start of 
translation, that contribute to this apparent discrepancy 
(P. J. Farabaugh, unpublished results). 

The Ty1 frameshifting efficiency of 3% obtained by immunoblot 
analysis is comparable to the efficiencies obtained from several 
viral systems that utilize different mechanisms for translation 
of the pol gene. Retroviruses that utilize programmed ribosomal 
frameshifting or read-through suppression undergo translational 
suppression at an efficiency of about 5% (reviewed by Hatfield 
et al. 1992). Yeast Ty3 retrotransposons have a +1 frameshifting 
efficiency of about 4%, even though these elements use a 
different frameshifting site (Kirchner, Sandmeyer and Forrest 
1992) and mechanism than Ty1 or Ty2 elements (P. J. Farabaugh, 
unpublished results). In addition, the yeast L-A double-stranded 
RNA virus undergoes -1 frameshifting to express its pol gene at 
an efficiency of about 2% (Dinman, Icho and Wickner 1991). Even 
though the molecular mechanisms underlying these expression 
strategies are quite different, a certain ratio of "structural" 
(Gag) proteins to "catalytic" (Gag-Pol) proteins may be a general 
requirement for formation of a transposition/replication-competent 
particle. 

Immunoblot analysis suggested that a processing defect of the 
TYA1-TYB1 fusion protein is related to the lower level of Ty1 
transposition in an hsx1 disruption mutant. The protein cleavages 
required to form p54-TYA1 and the p160 processing intermediate 
still occur, while the proteolytic cleavage required to convert 
the p160 processing intermediate to p23-PR and the p140 
processing intermediate apparently does not (Figure 4). Since 
formation of p140-TYB1 is defective, it follows that low amounts 
of mature IN and RT/RH are detected in an hsx1::LEU2 mutant. 
Perhaps Ty1-PR is not completely activated when more of the 
TYA1-TYB1 fusion protein is produced. Alternatively, normal 
amounts of p140, IN and RT/RH 
Figure 4.— Scheme for TYB1 protein processing in an hsx1 
mutant (modified from Garfinkel et al. 1991). p190-TYA1-TYB1, 
the 190-kD product of the gag-TYA1 and pol-TYB1 genes, is cleaved 
near the frameshift region (the vertical line separating TYA1 and 
TYB1). This proteolytic cleavage releases p160-TYB1, which is 
normally cleaved to form Ty1-PR (23 kD) and p140-TYB1. Cleavage 
of p140-TYB1 produces IN (90 kD) and RT/RH (60 kD). The 
dotted lines indicate that p160 and p23 may be encoded by both 
TYA1 and TYB1. The arrows show that neither the p140-TYB1 
precursor nor mature p90-Ty1-IN and p60-Ty1-RT/RH accumulate 
in an hsx1 mutant. Also shown is the p58-TYA1 precursor and 
p54 processed product, which are the major structural components 
of Ty1-VLPs. In an hsx1 mutant, we detect p54-TYA1 but not the 
p58-TYA1 precursor. 

may be synthesized, but are rapidly degraded because of an 
hsx1-dependent defect in Ty1-VLP assembly. 

To prove that aberrant protein stoichiometry is the major reason 
for the block in Ty1 transposition in an hsx1 disruption mutant, 
we showed that a pGTy1 plasmid expressing just the TYA1 gene not 
only restores Ty1 transposition in an hsx1::LEU2 mutant, but 
stimulates transposition to a level 50-fold higher than is 
observed in an HSX1 strain. We also showed that overexpression 
of TYA1 does not alter the level of Ty1 RNA in an hsx1 mutant. 
These results suggest that overexpression of TYA1 enhances the 
utilization of Ty1 RNA as a transposition template by rebalancing 
the level of TYA1 and TYA1-TYB1 proteins required to make 
transposition-competent Ty1-VLPs in an hsx1 mutant, even though 
the absolute level of Ty1 RNA is somewhat lower in the hsx1 
mutant. Furthermore, since GAL1-promoted Ty1 transposition 
decreases about 70-fold without a concomitant decrease in 
GAL1-promoted Ty1 protein levels in an hsx1 disruption mutant, 
whatever effect the hsx1 mutation has on Ty1 RNA levels is 
limited to chromosomal Ty1 elements. These results suggest that 
the hsx1 mutation may affect chromosomal Ty1 RNA accumulation, 
but we have not investigated this idea further. 

The stimulation of Ty1 transposition that occurs in an hsx1 
mutant when TYA1 is overexpressed supports and extends previous 
biochemical and genetic studies that identified the availability 
of Ty1-PR, which is encoded by TYB1, as a rate-limiting step in 
the Ty1 retrotransposition cycle (Curcio and Garfinkel 1992). 
Since more TYA1-TYB1 precursor protein is made in an hsx1 mutant, 
the availability of TYA1 protein becomes rate-limiting under 
these conditions. Therefore, a specific ratio of TYA1 to 
TYA1-TYB1 precursor proteins is required to form fully processed 
Ty1 proteins and functional Ty1-VLPs. 

Several retrovirus, retrotransposon and endogenous viral mutants 
in which gag and pol have been artificially fused are defective 
in particle formation, replication and infectivity. For example, 
fusion of gag and pol genes blocks production of infectious 
Moloney murine leukemia virus (Felsenstein and Goff 1988) and 
human immunodeficiency virus (Park and Morrow 1992). In Moloney 
murine leukemia virus, the Gag-Pol precursor protein is produced, 
but neither protein processing nor particle formation occurs. In 
human immunodeficiency virus, the Gag-Pol protein is produced and 
processed, but particles do not form. A protein processing and 
transposition defect similar to the one created in an hsx1 mutant 
is observed when TYA1 and TYB1 are fused by deleting one base at 
the frameshift site of a pGTy1 plasmid and transposition is 
galactose-induced in an HSX1 strain. Preliminary experiments 
suggest that Ty1-VLPs are formed in an hsx1 mutant (B. Faiola and 
D. J. Garfinkel, unpublished results) or when just the TYA1-TYB1 
fusion protein is expressed (J. D. Boeke and D. J. Garfinkel, 
unpublished results). Recently, a Ty3 GAG3-POL3 fusion mutant has 
been analyzed for defects in transposition and Ty3-VLP formation 
using a pGTy3 expression system (Kirchner, Sandmeyer and Forrest 
1992). The fusion mutant is transposition-defective, but can be 
rescued by coexpression of GAG3 or just the capsid domain of 
GAG3. Protein processing of GAG3 capsid protein and Ty3-IN is 
altered in the mutants, as is individual Ty3 protein and Ty3-VLP 
yield. Optimal ribosomal frameshifting and the proper Gag to 
Gag-Pol protein ratio are also required for L-A virus propagation 
in yeast (Dinman and Wickner 1992). Therefore, Ty1 and Ty3 
elements seem to be unique in that some particle assembly can 
take place when excess Gag-Pol precursor protein is synthesized 
(Kirchner, Sandmeyer and Forrest 1992), when only Gag protein is 
synthesized (Adams et al. 1987; Burns et al. 1992), or when 
PR-dependent protein processing is blocked (Adams et al. 1987; 
Muller et al. 1987; Youngren et al. 1988; Kirchner and Sandmeyer 
1993). 

In summary, our work has identified an essential role for HSX1 
in Ty1 frameshifting and transposition. This is one of a small 
but growing collection of cellular genes required for Ty1 
transposition that act post-transcriptionally (reviewed by Boeke 
and Chapman 1991; Garfinkel 1992). The additional defects of an 
hsx1 disruption mutant (Kawakami et al. 1992) may allow us to 
select second-site suppressors that restore Ty1 transposition 
without affecting Ty1 frameshifting mediated by tRNA-Arg. These 
suppressors may identify additional cellular genes involved in 
Ty1 frameshifting or Ty1-VLP assembly. 
A Tn10 insertion affecting SEF14 fimbrial synthesis in Salmonella enteritidis 
was located 13 bp upstream of a gene designated fimU. The 77-bp DNA 
sequence of fimU from S. enteritidis was identical to that of fimU encoding 
tRNA-Arg(UCU) from Salmonella typhimurium and 96% identical to that of 
the Escherichia coli argU homolog. Furthermore, the open reading frame 
adjacent to and overlapping the 3' end of fimU was similar to the prophage 
DLP12 integrase gene. The fimU-encoded transcript comigrated with total cellular tRNA and was 
predicted to form a tRNA-like cloverleaf structure containing the arginine anticodon UCU. Thus, fimU 
encoded a tRNA-Arg specific for the rare codon AGA. fimU mapped to the SEF21 fim operon located 
15 C's from the sef14 gene cluster. Although fimU was located within the SEF21 fim gene cluster, the 
fimU Tn10 insertion mutant of S. enteritidis was found to be defective in SEF14 as well as SEF21 (type 
1) fimbria production. SEF17 and SEF18 fimbria production was not affected. Complementation of this 
mutant with plasmid-borne fimU restored normal production of the fimbrins SefA and FimA as well as 
their respective fimbriae SEF14 and SEF21. This is the first description of a tRNA simultaneously 
controlling the production of two distinct fimbriae. 

INTRODUCTION 



Regulation of fimbria biosynthesis in bacteria is multifactorial and complex. 
In Escherichia coli, the expression of type 1 fimbriae is transcriptionally 
regulated in part by an inversion-dependent, phase-variable mechanism that 
involves two site-specific recombinases (17, 24, 27) and a tRNA-Leu molecule 
(32). tRNA-Leu, specific for the rare leucine codon UUG, stimulates type 
1 fimbria synthesis by influencing the switch from phase off to phase on (35). 
Recently, type 1 fimbria expression in Salmonella typhimurium has been 
shown to be regulated by mechanisms that are different from those controlling type 1 fimbria expression 
in E. coli (41). However, a common regulatory theme does exist in that a tRNA, specific for the rare 
arginine codons AGA and AGG, is required (40). Swenson et al. (40) suggest that the amount of 
tRNA-Arg(UCU) available in S. typhimurium may influence the expression of three genes encoding 
regulatory proteins of the fim gene cluster, since in each of these genes there is a high frequency of rare 
AGA codons recognized by tRNA-Arg(UCU). 

Salmonella enteritidis 27655-3b produces at least four fimbrial types: SEF17 (10), SEF18 (6), SEF21 
(type 1 fimbriae) (30), and SEF14 (7, 14). Although little is known about how the expression of the 
operons is regulated, SEF21 and SEF14 fimbriae are produced under similar environmental conditions 
(5, 12). Thus, the question arises as to whether or not their expression is coregulated. In a previous 
study, a Tn10 insertion mutant, S. enteritidis 3b-122, was generated which no longer produced SEF14 
fimbriae and carried the transposon outside of sefA, the structural gene for these fimbriae (14). Further 
characterization of 3b-122 in this study indicated that this mutant was also defective in type 1 fimbria 
(SEF21) production, suggesting that the Tn10 interrupted a gene whose product coregulated the 
expression of both SEF14 and SEF21 fimbriae. The results of this study show for the first time that the 
production of two fimbriae is coregulated by the same tRNA. 

MATERIALS AND METHODS 
Bacterial strains, plasmids, media, and growth conditions. S. enteritidis 
27655-3b, originally isolated from human feces, was provided by 
T. Wadstrom (University of Lund, Lund, Sweden). S. enteritidis 27655-3b-122, 
a Tn10 insertion mutant of the parent strain, was constructed by Feutrier 
et al. (14). E. coli DH5α and S. enteritidis 3b-122 were used as hosts for pSFA 
(11), pLU/TA 4-1, and pGEM-Tl. To create pLU/TA 4-1, PCR-amplified 
fimU was cloned into pGEM-T (Promega Corp.), a TA cloning vector containing 3'-terminal thymidines. 
To create pGEM-Tl, the 3'-overhanging thymidines of pGEM-T were filled in with dATP and T4 DNA 
polymerase prior to ligation (38). 

Bacteria were grown at 37°C with shaking in Luria-Bertani (LB) broth (36) supplemented with 
ampicillin to a final concentration of 250 µg/ml except where noted. To analyze the production of 
fimbriae by S. enteritidis, the cells were grown in various liquid media under different growth conditions 
(Table 1). Cultures grown in LB broth and terrific broth (TFB) (38) were transferred to ice 24 h after 
inoculation, whereas cultures grown in colonization factor antigen (CFA) medium (13) and T broth (10) 
were transferred to ice 48 h after inoculation. All the cultures were standardized to an optical density at 
630 nm (OD630) of 1. 



TABLE 1. Production of SefA, FimA, AgfA, and SefD fimbrins by 
S. enteritidis 3b and S. enteritidis 3b-122 



Subcloning Tn10 from S. enteritidis 27655 3b-122. S. enteritidis 3b-122 chromosomal DNA was 
isolated by the method of Aim et al. (1), purified by CsCl centrifugation (38), and digested with HindIII. 
To subclone the Tn10-containing chromosomal DNA fragment, size-fractionated HindIII fragments (2 to 
3 and 3 to 5 kb) were purified from an agarose gel with Sephaglas (Pharmacia Biotech), ligated to 
HindIII-digested and -dephosphorylated cloning vector pTZ19R, and then introduced into E. coli DH5α 
by transformation (38). A total of 2,880 colonies grown on Hybond N+ membranes (Amersham) were 
screened by hybridization to the oligonucleotide probe Tn10 IS10L+R (5' 
GCAGAATTGGTAAAGAGA 3'). This probe, complementary to the sequence located 134 bp inside 
the insertion sequence of Tn10, was used to identify Tn10-containing clones. The probe, end labelled 
with [γ-32P]ATP, was hybridized to the membranes at 45°C in prehybridization buffer (38) containing 
200 µg of herring sperm DNA (Sigma)/ml. Following hybridization, the membranes were washed in 
0.2× SSC buffer (1× SSC is 0.15 M NaCl plus 0.015 M sodium citrate) containing 0.1% sodium dodecyl 
sulfate (SDS) at 45°C, and the results were recorded by autoradiography on Kodak BioMax film. 
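
As a worked example of the wash-buffer notation, the salt concentrations in a diluted SSC buffer follow directly from the 1× definition given above. This is only an illustrative sketch; the names `ONE_X_SSC_MOLAR` and `dilute_ssc` are not from the original methods.

```python
# 1x SSC as defined in the text: 0.15 M NaCl plus 0.015 M sodium citrate.
ONE_X_SSC_MOLAR = {"NaCl": 0.15, "sodium citrate": 0.015}


def dilute_ssc(strength):
    """Return molar concentrations of each salt in `strength`x SSC."""
    return {salt: molar * strength for salt, molar in ONE_X_SSC_MOLAR.items()}


# The 0.2x SSC wash used after hybridization.
wash = dilute_ssc(0.2)
for salt, molar in wash.items():
    print(f"{salt}: {molar * 1000:.0f} mM")
```

The 0.2× wash therefore contains 30 mM NaCl and 3 mM sodium citrate.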

DNA sequencing and computer analyses. The Tn10-positive clones and the three fimU PCR products 
amplified with primers located outside the fimU gene were sequenced with Sequenase version 2 (United 
States Biochemicals). The custom oligonucleotide primer Tn10 IS10L+R was synthesized on a PCR-MATE 
EP model 391 DNA synthesizer (Applied Biosystems Inc.). The DNA sequences obtained were 
analyzed with DNA Strider 1.1 (26). Similarity searches of the National Center for Biotechnology 
Information (NCBI) databases were conducted with the program BLASTN (2). 

PCR amplification of fimU. Custom oligonucleotide primers fimULT 
(TAATAGCGATACGCAGAATTCAAAAATATCCTACACGGCAGG) and fimULB 
(CAGATATGCTCACCTAAGCTTTAATCATTTAACGGAACACGG) were designed based on the 
S. typhimurium chromosomal DNA sequence flanking fimU and were synthesized by Gibco BRL. fimU 
was PCR amplified from a previously prepared cosmid clone, pPB523 (12), with fimULT and fimULB. 
To facilitate the cloning of the amplified product, the primers were designed to contain an EcoRI site 
and a HindIII site, respectively (underlined). Amplification was carried out in a 100-µl reaction volume 
containing 10 µl of pPB523 (0.01 µg/ml), 25 pmol of each primer, the four deoxynucleotide 
triphosphates (Boehringer Mannheim) at 0.5 mM each, and 2 U of Taq DNA polymerase (Boehringer 
Mannheim) in reaction buffer consisting of 50 mM Tris-HCl (pH 8.5), 20 mM KCl, 2.5 mM MgCl2, and 
0.5 mg of bovine serum albumin/ml. The Taq enzyme was added after an initial 3-min denaturation step 
at 95°C (4). Thermocycling was performed in a PTC-100 Programmable Thermal Controller (MJ 
Research Inc.) as follows: 1 cycle of 75°C, 1 min; 50°C, 2 min; 74°C, 2 min and 30 cycles of 95°C, 
1 min; 50°C, 1 min; 74°C, 2 min, followed by an 8-min elongation at 74°C. 
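
Because the underlining that marked the restriction sites does not survive in plain text, the sites can be located by a simple string search. The sketch below uses the two primer sequences exactly as printed above and the standard recognition sequences for EcoRI (GAATTC) and HindIII (AAGCTT); the function name `find_sites` is illustrative only.

```python
# Primer sequences as printed in the text (5' to 3').
PRIMERS = {
    "fimULT": "TAATAGCGATACGCAGAATTCAAAAATATCCTACACGGCAGG",
    "fimULB": "CAGATATGCTCACCTAAGCTTTAATCATTTAACGGAACACGG",
}

# Standard recognition sequences of the two enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}


def find_sites(sequence):
    """Map each enzyme to the 0-based index of its first site, if present."""
    return {
        enzyme: sequence.find(site)
        for enzyme, site in SITES.items()
        if site in sequence
    }


for name, sequence in PRIMERS.items():
    print(name, len(sequence), "nt", find_sites(sequence))
```

Each primer is 42 nucleotides long and carries a single engineered site: EcoRI in fimULT and HindIII in fimULB, in both cases beginning at the sixteenth base.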

Subcloning PCR-amplified fimU. PCR-amplified fimU was purified from a 1% agarose gel with 
Sephaglas, ligated to pGEM-T according to the manufacturer's instructions (Promega Corp.), and then 
transformed into E. coli DH5α (38). 

Mapping of fimU on genomic restriction maps of Salmonella and E. coli strains. The fimU gene was 
mapped as previously described for the four fimbrin genes sefA, agfA, fimA, and sefD (8). The fimU 
probe, prepared by running EcoRI- and HindIII-digested pLU/TA 4-1 on a 1% agarose gel and purifying 
the fragment with Sephaglas, was labelled with [α-32P]dATP (Pharmacia Biotech) by nick translation. 
The radiolabeled fimU probe was hybridized to nitrocellulose blots containing XbaI- and BlnI-digested 
E. coli, S. typhimurium, and S. enteritidis genomic DNA separated by pulsed-field gel electrophoresis 
(blots were provided by K. Sanderson and S.-L. Liu; see reference 8). 

RNA extraction and Northern blot analysis. Total RNA was prepared from S. enteritidis 3b, 3b-122, 
3b-122 pGEM-Tl, and 3b-122 pLU/TA 4-1 grown statically in LB or CFA broth at 37°C for 45 h by a 
modification of the procedure of McCormick et al. (28) as described in Clouthier et al. (7). For fimU 
transcript analysis, the RNA was separated on a 10% polyacrylamide gel containing 8 M urea and 
transferred onto Hybond N+ membranes (Amersham) with transfer buffer (0.025 M phosphate buffer 
[pH 6.5]) and an LKB Pharmacia semidry blotting apparatus. For sefA transcript analysis, the 
electrophoretic separation of total cellular RNA and its subsequent transfer to Hybond N+ membranes 
(Amersham) were performed as described in Fourney et al. (15). The fimU- and sefA-specific probes 
used for Northern blot analysis were gel purified from EcoRI and HindIII digests of pLU/TA 4-1 and 
pSFA, respectively, with Sephaglas. The probes were labelled with [α-32P]dATP (Pharmacia Biotech) by 
nick translation and hybridized to the blots at 65°C for 18 h in the presence of 200 µg of herring sperm 
DNA (Sigma)/ml. The membranes were washed at high stringency (0.2× SSC buffer-0.1% SDS, 65°C), 
and the results were recorded on Kodak BioMax or X-Omat AR5 film. 

SDS-PAGE and Western blot analysis. Whole-cell lysates of S. enteritidis 3b or clones of this strain 
were screened for the presence of four fimbrial types. SEF14, -18, and -21 were solubilized from whole 
cells with SDS-polyacrylamide gel electrophoresis (PAGE) sample buffer supplemented with 0.2 M 
glycine (pH 2, 100°C, 10 min), whereas SEF17 fimbriae were solubilized from whole cells with formic 
acid according to the method of Collinson et al. (9, 10). A portion of each culture (1 OD630 unit) was 
resuspended in 200 µl of sample buffer, and 10 µl (0.01 OD630 unit) was loaded per lane. Proteins in 
these samples were separated by SDS-PAGE, electrophoretically transferred to nitrocellulose, and 
screened with rabbit polyclonal anti-SEF14 (7), SEF17 (10), SEF18 (6), or SEF21 immune serum (30). 
Immunoreactive proteins were detected with goat anti-rabbit immunoglobulin G-alkaline phosphatase 
conjugates (Cedarlane) and visualized with 5-bromo-4-chloro-3-indolylphosphate and Nitro Blue 
Tetrazolium (Sigma). 

Electron microscopy. SEF14 and SEF21 fimbriae on S. enteritidis 3b, 3b-122, 3b-122 pGEM-Tl, and 
3b-122 pLU/TA 4-1 were immunogold labelled with SEF14- or SEF21-specific rabbit polyclonal 
immune sera followed by incubation with protein A-15-nm-diameter gold particles (Cedarlane). 
Negative staining was performed as described previously (10). 

Nucleotide sequence accession number. The nucleotide sequence reported herein for fimU has been 
submitted to GenBank and has been given the accession number AF013136. 
RESULTS 



Fimbria production in S. enteritidis 3b-122. Production of SEF14, -17, 
-18, and -21 fimbriae by the S. enteritidis Tn10 mutant 3b-122 grown under 
various growth conditions was assessed by Western blotting with fimbria-specific 
antisera, and the results were compared to those obtained with the 
wild-type strain S. enteritidis 3b. The Tn10 mutation in 3b-122 had a 
pronounced effect on SEF14 and SEF21 production but little or no effect on 
SEF17 and SEF18 production (Table 1). As previously reported, SEF14 fimbriae were not expressed by 
3b-122 grown in static CFA broth at 37°C (Table 1). Further characterization in this study, however, 
showed that 3b-122 lost SEF14 expression under all the growth conditions in which 3b was SEF14 
positive (Table 1). In addition to the SEF14-negative phenotype, 3b-122 was also defective for type 
1 fimbria (SEF21) production. The wild-type strain produced FimA under all growth conditions tested, 
whereas 3b-122 only produced FimA in CFA broth cultures. Thus, the result of the Tn10 insertion was 
the complete loss of SEF14 expression under all growth conditions and selective loss of SEF21 
expression under certain growth conditions. The altered production of SEF14 and SEF21 fimbriae in the 
Tn10 insertion mutant relative to the wild-type expression patterns suggested that the transposon 
insertion interrupted a gene whose product was required for both SEF14 and SEF21 fimbria expression. 

Identification of the Tn10 insertion site in S. enteritidis 3b-122. To determine the Tn10 insertion site, 
HindIII fragments of 3b-122 chromosomal DNA were subcloned into pTZ19. Clones containing Tn10 
were identified with the probe IS10L+R, which hybridized within the insertion sequence located at 
either end of the transposon (Fig. 1A). Of the 17 Tn10-positive clones identified, 3 were subjected to 
DNA sequence analysis. Comparison of the 3b-122 DNA sequence flanking Tn10 to sequences listed in 
the NCBI databases revealed that the sequence was 99% identical to that of the region located between 
fimW and fimU of the S. typhimurium type 1 fimbrial gene cluster. Thus, on the basis of DNA sequence 
comparison, Tn10 was inserted 13 bp upstream of the predicted start site of the gene, which will 
hereafter be referred to as fimU (Fig. 1A and B). 

FIG. 1. Location of Tn10 on the S. enteritidis 3b-122 
chromosome and identification of the genes flanking the Tn10 
insertion in S. enteritidis 3b. (A) Schematic diagram of 
S. enteritidis (S.e.) 3b-122 chromosomal DNA (black line) 
showing the Tn10 insert and the strategy used to obtain the 
chromosomal DNA sequence adjacent to one side of this 
insert. A 3-kb HindIII fragment comprising 3b-122 
chromosomal DNA fused to one end of Tn10 was identified 
by hybridization with the Tn10 oligonucleotide IS10L+R. 
IS10L+R was also used as a sequencing primer to obtain 
240 bp of DNA sequence from the subcloned HindIII 
fragment. (B) Schematic diagram of the S. typhimurium (S.t.) 
chromosome (black line) between fimW and fimU of the type 1 
fimbrial gene cluster that was homologous to the 240 bp of 
S. enteritidis 3b-122 DNA sequence. Two 42-bp 
oligonucleotide primers, fimULT and fimULB (horizontal 
arrows), were made based on the S. typhimurium sequence 
previously deposited in GenBank (L19338) by Swenson and 
Clegg (39). (C) Segment of the S. enteritidis 3b chromosome 
(black line) amplified by PCR with the primers fimULT and 
fimULB. This amplified DNA segment was subcloned and 
sequenced (Fig. 2) to identify the DNA flanking the Tn10 
insert. The Tn10 insertion point (vertical arrow) was 
determined to be between the -10 region and the start of the 
fimU gene. The presence of fimW, fimU, and the -35 region 
on the 3b chromosome is also noted. 



DNA sequence analysis of fimU subcloned from S. enteritidis 3b. By using primers fimULT and 
fimULB designed from the sequence flanking the fimU gene in S. typhimurium, a 490-bp fragment was 
PCR amplified from the cosmid clone pPB523-G containing 35 kb of S. enteritidis 3b DNA (Fig. 1B). 
The fimU PCR product was subcloned into vector pGEM-T (Fig. 1B). Nucleotide sequence analysis of 
three clones revealed a potential promoter, but a putative translated protein could not be detected by 
open reading frame analysis. Comparison of the DNA sequence downstream of the potential promoter to 
sequences listed in the NCBI databases showed that the sequence was identical to that of fimU of 
S. typhimurium (Fig. 1C and 2B) and 96% similar to that of argU/dnaY of E. coli (Fig. 2A). These genes 
encode arginine-specific tRNAs that recognize the rare AGA codon. The nucleotide sequence of fimU 
from S. enteritidis 3b contained 4 inverted repeats, which were predicted to fold the sequence into the 
characteristic tRNA-like cloverleaf structure (Fig. 2B) with UCU in the expected tRNA anticodon 
position. Together, these data suggested that fimU from S. enteritidis 3b encoded an arginine-specific 
tRNA. 
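
The link between the UCU anticodon and the AGA codon is antiparallel base pairing. A minimal sketch follows, assuming strict Watson-Crick pairing only; it ignores wobble pairing and modified bases, so it does not model the reported recognition of both AGA and AGG by a single tRNA. The function name `watson_crick_codon` is illustrative.

```python
# Watson-Crick RNA base pairs.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def watson_crick_codon(anticodon):
    """Return the codon (5' to 3') that pairs antiparallel with `anticodon`."""
    return "".join(COMPLEMENT[base] for base in reversed(anticodon))


for anticodon in ("UCU", "CCU"):
    print(anticodon, "pairs with", watson_crick_codon(anticodon))
```

Under this assumption the UCU anticodon of the fimU product pairs with AGA, and the CCU anticodon discussed for the yeast HSX1 tRNA pairs with AGG.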




FIG. 2. Sequence comparison of fimU from S. enteritidis 3b (S.e.) 
with fimU of S. typhimurium (S.t.) and argU of E. coli (E.c.) as 
well as the predicted fimU RNA secondary structure. (A) 
Alignment of S. enteritidis fimU DNA sequence with both the 
S. typhimurium fimU (39) and E. coli (31) argU gene sequences. 
Symbols: DNA sequence identity; -, gaps introduced to 
maximize homology; +, bases constituting the -35 and -10 boxes; 
bases constituting the anticodon; V, position of the Tn10 
insertion on the S. enteritidis 3b-122 chromosome. The DNA 
sequence corresponding to the proposed mature tRNA-Arg(UCU) is 
underlined. (B) Diagram of the proposed secondary structure for 
tRNA-Arg(UCU) from S. enteritidis 3b. The anticodon bases are 
underlined. 



Analysis of the nucleotide sequence downstream of fimU revealed an open reading frame oriented in the 
opposite direction such that the 3' ends of the two genes overlapped. The predicted amino acid sequence 
was 88% similar to that of the prophage DLP12 integrase of E. coli. The sequence further downstream 
of fimU displayed 60 to 88% similarities to those encoding transposases of the IS5 family of insertion 
elements. 
Mapping fimU on the S. enteritidis 3b genome. Like fimA, fimU was localized to chromosomal XbaI 
and BlnI fragments in the 98.5- to 13.0-C's region of the chromosome in both Salmonella serovars. By 
using a series of S. typhimurium and S. enteritidis Tn10 mutants, the fimU gene was more precisely 
mapped to between purE884::Tn10 at 12.6 C's and the first XbaI restriction site at 13.6 C's in 
S. enteritidis or 13.0 C's in S. typhimurium. Thus, fimU mapped to the same region shown previously to 
contain the fimA gene in the fim gene cluster. 

Analysis of fimU transcription. To determine if the fimU transcript was the same size as tRNA, a
fimU-specific probe was hybridized to a blot containing total RNA isolated from S. enteritidis 3b,
3b-122, 3b-122 pGEM-Tl, or 3b-122 pLU/TA 4-1 grown under conditions optimal for type 1 fimbria
(SEF21) production in S. enteritidis 3b (static LB broth, 48 h, 37°C). The fimU-specific probe
hybridized to a transcript that was present in total RNA from 3b and 3b-122 pLU/TA 4-1 (Fig. 3). The
fimU transcript was consistently difficult to detect on Northern blots of 3b RNA even with excessive
amounts of RNA loaded on the gels (25 µg [Fig. 3, lanes 1 to 4] and 50 µg [Fig. 3, lanes 5 to 8]) and
extended exposure of the blots to X-ray film. Although the transcript was not found in RNA prepared
from 3b-122, trace levels were evident in 3b-122 carrying the vector pGEM-T (Fig. 3), but the transcript
was even more difficult to detect than its counterpart in 3b. The transcript detected with the
fimU-specific probe comigrated with tRNA, suggesting that the product of fimU from S. enteritidis was
indeed a tRNA molecule.

FIG. 3. Northern blot analysis of tRNA Arg (UCU)
production in S. enteritidis 3b strains. A fimU-specific probe
was hybridized to PAGE-separated total RNA from
S. enteritidis 3b (lanes 1 and 5), 3b-122 (lanes 2 and 6),
3b-122 pGEM-Tl (lanes 3 and 7), or 3b-122 pLU/TA 4-1 (lanes
4 and 8). Lanes 1 to 4 contain 25 µg of RNA, and lanes 5 to
8 contain 50 µg of RNA.

Complementation of fimbrin expression and fimbria assembly in 3b-122. Fimbria expression was
examined in S. enteritidis 3b, 3b-122, 3b-122 pGEM-Tl, and 3b-122 pLU/TA 4-1 grown under
conditions that promoted production of both SEF14 and SEF21 by the wild-type strain (static CFA
broth, 48 h, 37°C). Western blot analysis of whole-cell lysates using SEF14- or SEF21-specific antisera
showed that complementation of the insertion mutation in 3b-122 with pLU/TA 4-1 restored SEF14 and
SEF21 fimbria expression (Fig. 4). Thus, fimU affected the production of two fimbrins encoded by genes
located in two different gene clusters.

FIG. 4. Complementation analysis of S. enteritidis 3b-122
Tn10 mutant with the fimU-containing recombinant plasmid
pLU/TA 4-1. Whole-cell extracts were analyzed by Western
blotting to determine the presence of SefA (21 kDa) and
FimA (14 kDa) fimbrin proteins in S. enteritidis 3b (lane 1),
3b-122 (lane 2), 3b-122 pGEM-Tl (lane 3), and 3b-122
pLU/TA 4-1 (lane 4). Numbers at right indicate positions of
SefA (21) and FimA (14).

Assembly of SEF14 and SEF21 fimbriae on the cell surface of S. enteritidis 3b, 3b-122, 3b-122
pGEM-Tl, or 3b-122 pLU/TA 4-1 was determined by immunogold labelling and electron microscopy
performed on cells grown in static CFA broth for 48 h at 37°C. The wild-type strain, S. enteritidis 3b,
expressed both SefA and FimA and assembled the respective subunits into SEF14 and SEF21 fimbriae
(Table 2). Similar analyses of 3b-122 and 3b-122 pGEM-Tl showed that SEF14 was not produced and
that SEF21 fimbriae on the cell surfaces of these two strains were rarely detected (Table 2), a result
consistent with the Western blot data (Fig. 4). In contrast, SEF14 and SEF21 fimbriae were evident on
the surface of 3b-122 pLU/TA 4-1 (Fig. 5) at levels equal to or greater than that on 3b. Thus, expression
of SefA and FimA fimbrins and assembly of their respective fimbriae were restored by complementation
of the Tn10 fimU mutation in 3b-122 with a wild-type copy of the fimU gene on pLU/TA 4-1. Cells
producing SEF17 were readily seen without immunolabelling on all the grids prepared for electron
microscopy (Table 2).



TABLE 2. Detection of assembled SEF14 and SEF21 fimbriae in various S. enteritidis strains by
immunoelectron microscopy with specific antifimbrial sera



FIG. 5. Analysis of SEF14 and SEF21 fimbria assembly in
S. enteritidis 3b-122 pLU/TA 4-1 by immunogold electron
microscopy. S. enteritidis 3b-122 pLU/TA 4-1 was labeled
with protein A-gold and negatively stained following
incubation with immune serum generated to SEF14 (A) or
SEF21 (B). Arrows indicate individual immunogold-labeled
SEF14 and SEF21 fimbriae in panel A and B insets,
respectively. The average diameter of the gold particles was
15 nm. Bar, 0.5 µm (electron micrograph) or 50 nm (inset).



Analysis of sefA transcription. The effect of tRNA Arg (UCU) on sefA transcript production was
analyzed by hybridizing a sefA-specific probe to a blot containing total cellular RNA isolated from
S. enteritidis 3b, 3b-122, 3b-122 pGEM-Tl, or 3b-122 pLU/TA 4-1 grown under conditions optimal for
SEF14 production in 3b (static CFA broth, 48 h, 37°C). The sefA-specific probe hybridized to a
660-base transcript that was present in RNA from 3b and 3b-122 pLU/TA 4-1 but absent from RNA from
3b-122 and 3b-122 pGEM-Tl (Fig. 6). The strains expressing the sefA transcript corresponded to those
carrying a functional fimU gene, suggesting that fimU was required for expression of sefA.

FIG. 6. Northern blot analysis of sefA transcription in
S. enteritidis 3b strains. A sefA-specific probe was hybridized
to 10 µg of total RNA from S. enteritidis 3b (lane 1), 3b-122
(lane 2), 3b-122 pGEM-Tl (lane 3), or 3b-122 pLU/TA 4-1
(lane 4).

DISCUSSION



fimU, located in the fim (sef21) operon of S. enteritidis 3b, encodes an arginine-specific tRNA that is
required for expression of not only SEF21 fimbriae (type 1) but also SEF14 fimbriae. The product of
fimU in 3b is a tRNA, since Northern blot analysis of RNA from 3b and 3b-122 pLU/TA 4-1 demonstrates
that the fimU transcript comigrates on polyacrylamide gels with tRNA. Furthermore, the 77-nucleotide
sequence of the fimU gene of 3b is identical to that of fimU of S. typhimurium (39) and shares extensive
homology with that of argU encoding tRNA Arg (UCU) from E. coli (18). The fimU-encoded transcript
from 3b can be folded into a typical tRNA cloverleaf structure containing the 3'-terminal sequence CCA
as well as the invariant or semivariant nucleotides common to tRNA molecules (19, 34). Finally, the
DNA sequence 5' to the fimU gene contains features common to the promoters of tRNA operons,
including the consensus E. coli -10 and -35 promoter elements (16, 21) and a G+C-rich discriminator
sequence (16, 42, 43). The regulatory mechanisms controlling fimU expression are unknown. Recently,
however, the integration and excision of plasmids, phage, and pathogenicity islands into and out of the
chromosomes at tRNA loci have been shown to affect tRNA gene expression (20, 33). As shown with the
E. coli tRNA gene argU (25), the open reading frame adjacent to and overlapping fimU is a homolog of
the integrase gene (int) from the defective lambdoid prophage DLP12. Integration of prophage DLP12 at
this site prevents cotranscription of the int gene with fimU, which may contribute to the regulation of
fimU expression in S. enteritidis 3b.

The influence of tRNA Arg (UCU) encoded by fimU on SEF14 and SEF21 fimbria production is evident
in the Tn10 insertion mutant S. enteritidis 3b-122. The transposon, inserted between the predicted
promoter and the 5' end of the mature fimU transcript, disrupts transcription of fimU and thus tRNA Arg
(UCU) production. This mutation results in the loss of SefA production and selective loss of FimA
production, i.e., the subunits of SEF14 and SEF21 (type 1) fimbriae, respectively. Thus, in S. enteritidis
3b, tRNA Arg (UCU) is required for SEF14 production and enhances type 1 fimbria (SEF21) production.
In E. coli, cross-talk has also been reported to occur between adhesin gene clusters (29), and tRNA
molecules have been shown to play a key role in global regulatory cascades (20). However, this is the
first study to show that a tRNA-specific locus found on one fimbrial operon influences the production of
two fimbrins whose operons are separated by 15 Cs on the chromosome.

tRNA Arg (UCU) encoded by fimU is required for transcription of sefA, the gene encoding the subunit of
SEF14 fimbriae in S. enteritidis 3b. The regulatory mechanism is unknown, but a direct correlation
between the abundance of tRNAs and the occurrence of the respective codons in protein genes (22, 23)
has been suggested to control the translation of genes containing rare codons (3, 37). Since AGA, the
codon recognized by the tRNA Arg (UCU) species encoded by fimU, is one of the least-used codons for
arginine, perhaps the limited availability of charged tRNAs for this minor codon controls the level
of translation of the sefA transcript or of a transcript whose protein product is involved in the regulation
of sefA transcription. sefA contains neither of the rare arginine codons AGA or AGG recognized by
tRNA Arg (UCU), indicating that fimU expression would not have a direct effect on the translation of
sefA mRNA. However, sefE, the gene encoding the putative AraC-like transcriptional activator of the
sef14 gene cluster, contains 13 arginine codons, including 9 AGA codons and 1 AGG codon (5). Perhaps
the tRNA Arg (UCU) encoded by fimU regulates translation of sefE, which would in turn affect
transcription of sefA and the downstream genes.
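
A codon tally of this kind can be reproduced with a short sketch. The function name count_arginine_codons and the toy sequence are hypothetical illustrations, not the sefA or sefE sequences; the sketch assumes an in-frame coding sequence read from its first base and uses the six arginine codons of the standard genetic code.

```python
from collections import Counter

# Arginine codons in the standard genetic code (DNA alphabet).
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def count_arginine_codons(cds: str) -> Counter:
    """Count each arginine codon in an in-frame coding sequence.

    The sequence is read in frame from its first base; any trailing
    partial codon is ignored. RNA input (U) is converted to DNA (T).
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return Counter(codon for codon in codons if codon in ARG_CODONS)


# Hypothetical toy sequence, for illustration only: ATG AGA AGG CGT AGA TAA
toy_cds = "ATGAGAAGGCGTAGATAA"
counts = count_arginine_codons(toy_cds)
print(counts["AGA"], counts["AGG"], sum(counts.values()))  # 2 1 4
```

Applied to a real coding sequence, the same tally would give the number of AGA and AGG codons relative to all arginine codons, which is the comparison made for sefA and sefE above.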

With the exception of the gene fimA encoding the subunit of SEF21 fimbriae (12), the remainder of the
sef21 gene cluster has not been characterized in S. enteritidis 3b. Thus, the mechanism for regulation of
type 1 fimbria synthesis by fimU remains to be determined. However, type 1 fimbria (SEF21) production
is optimal when S. enteritidis 3b is grown at 37°C in a static broth culture but suboptimal when the cells
are grown at lower temperatures (21 to 30°C) in shaking broth culture or on solid medium (12).
Similarly, expression of SEF14 fimbriae by S. enteritidis 3b is environmentally controlled by
temperature, medium composition, and aeration, and is optimal at 37°C in static, aerobic CFA broth (5).
Thus, the coregulation of SefA and FimA fimbrin production by fimU-encoded tRNA Arg (UCU) results
in the corresponding fimbriae being expressed under similar environmental conditions, which may give
the bacteria a competitive advantage for survival.